subject:"Re\: \[PATCH 0\/4\]"

Jeff King  writes:

> I have tried to help with the actual problem, by identifying the root
> cause (that the trace code's strategy is not fundamentally broken, but
> that it relies on O_APPEND whose guarantees are obviously not portable
> to Windows) and exploring possible solutions there (FILE_APPEND_DATA).
>
> Lacking any kind of Windows development environment, I'd just as soon
> stop there and let more knowledgeable people take over. But please don't
> characterize my presence in this thread as some kind of willful harm or
> ignoring of the problem.

Thanks.

It seems like the other Johannes and the other Jeff are also
interested, so we may hopefully see a properly fixed O_APPEND,
which would be the real solution that would help appenders
that are *not* tracing subsystem, too.

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

Johannes Sixt  writes:

> As this buglet looks like a recurring theme, and a proper fix is
> preferable over repeated work-arounds. To me it looks like we need
> some sort of locking on Windows. Unless your friends at Microsoft have
> an ace in their sleeves that lets us have atomic O_APPEND the POSIX
> way...

Just to put the severity of the issue in context, we use O_APPEND in
a few codepaths, and the trace thing for debugging is the only thing
that could have multiple writers.  Other users of O_APPEND are:

 * refs/files-backend.c uses it so that a reflog entry can be
   appended at the end, but because update to each ref is protected
   from racing at a lot higher layer with a lock, no two people
   would try to append to the same reflog file, so atomicity of
   O_APPEND does not matter here.

 * sequencer.c wants to use it when moving one insn from the todo
   list to the 'done' list when it finishes one operation.  If you
   are running two sequences in a single repository, intermixed
   'done' list would be the least of your problem, so presumably we
   are fine here.

It may make sense to allow GIT_TRACE to have a placeholder
(e.g. "/tmp/traceout.$$") to help debuggers arrange to give
different processes their own trace output file, which perhaps may
be a simpler and less impactful to the performance solution than
having to make locks at an application layer.

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-10 Thread Jeff King

On Fri, Aug 10, 2018 at 06:43:07PM +0200, Johannes Schindelin wrote:

> > > Correct.  But I am more worried about the "mixed/overwriting"
> > > breakage, if there is one; it means we may need to be prepared for
> > > systems that lack O_APPEND that works correctly.  I initially just
> > > assumed that it was what Dscho was seeing, but after re-reading his
> > > message, I am not sure anymore.
> > > 
> > > I think the "do not trace the other side" approach you suggest for
> > > these tests that only care about one side is more appropriate
> > > solution for this particular case.  We then do not have to worry
> > > about overwriting or output from both sides mixed randomly.
> 
> Just so everybody knows: I never received the above two mails. Something
> is seriously rotten in GMX. This seems to have started a week or two ago.

I think it may just have been that they went to the gitgitgadget
address; I explicitly added you as a cc when I sent the patch with a
commit message, which was when I first noticed you weren't cc'd on the
earlier bits.

> What you cannot see easily, unless you go the route that I offered
> Jonathan (open a PR on gitgitgadget/git which will automatically run the
> test suite on Windows, macOS and Linux, and of course you can do anything
> in a PR, including narrowing down what tests are run) is that sometimes
> those lines in the `trace` file were clearly *incomplete*. That is, even
> if both processes tried to write atomically, one managed to overwrite the
> other's buffer in flight.
> 
> This is a pretty serious thing, even worse than the failing test suite,
> and I don't think that your patch acknowledges how serious this is.

I don't see anything in what you wrote above or in your original cover
letter or series that contradicts the notion that O_APPEND is simply not
atomic on Windows. Overwriting the same byte ranges in the file is what
I would expect on Linux, as well, if the file were opened without
O_APPEND.

I have mixed reports from searching online whether an equivalent to
O_APPEND exists. Some sources seem to say that FILE_APPEND_DATA is.  I
couldn't find anything definitive in Microsoft documentation about
whether it actually provides atomicity.  From my (admittedly limited)
digging, that does not seem to be used by the msys2 runtime, so it might
be worth exploring.

> And please don't give me that "but it works on Linux" response. Seriously.
> Sheesh.

But it does. And it's designed to. There is literally nothing to fix on
Linux.

> Even if you have not managed to trigger it, the POSIX standard seems not
> to define clearly what the behavior is of two competing write() calls,
> unless you lock the files appropriately, as my patch series does.

POSIX says:

  If the O_APPEND flag of the file status flags is set, the file offset
  shall be set to the end of the file prior to each write and no
  intervening file modification operation shall occur between changing
  the file offset and the write operation.

and:

  This volume of POSIX.1-2017 does not specify the behavior of
  concurrent writes to a regular file from multiple threads, except that
  each write is atomic.

I admit the writing is a little turgid, but I think between the two it
is promising the behavior that we expect (and I think that
interpretation is pretty common in the Unix world).

> So unless you are willing to ignore, to willfully keep this breakage, I
> would suggest not to introduce the ugliness of an overridden upload-pack
> for the sole purpose of disabling the tracing on one side, but instead to
> get this here bug fixed, by helping me with this here patch series.

I'm OK if you want to live with the broken test in the interim. But
after your complaining about the imminence of it "infecting" master, I
thought you might be happy to see a solution that was more immediate.

I have tried to help with the actual problem, by identifying the root
cause (that the trace code's strategy is not fundamentally broken, but
that it relies on O_APPEND whose guarantees are obviously not portable
to Windows) and exploring possible solutions there (FILE_APPEND_DATA).

Lacking any kind of Windows development environment, I'd just as soon
stop there and let more knowledgeable people take over. But please don't
characterize my presence in this thread as some kind of willful harm or
ignoring of the problem.

-Peff

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-10 Thread Johannes Sixt


Am 10.08.2018 um 18:51 schrieb Jeff Hostetler:



On 8/10/2018 12:15 PM, Johannes Sixt wrote:

Am 09.08.2018 um 19:35 schrieb Johannes Schindelin via GitGitGadget:
I reported a couple of times that t5552 is not passing reliably. It 
has now

reached next, and will no doubt infect master soon.

Turns out that it is not a Windows-specific issue, even if it occurs 
a lot

more often on Windows than elsewhere.

The culprit is that two processes try simultaneously to write to the 
same

file specified via GIT_TRACE_PACKET, and it is not well defined how that
should work, even on Linux.


Thanks for digging down to the root cause. As has been said, the 
behavior of O_APPEND is well-defined under POSIX, but last time I 
looked for equivalent feature on Windows, I did not find any.


Last time was when I worked around the same failure in 
t5503-tagfollow.sh in my private builds:


https://github.com/j6t/git/commit/9a447a6844b50b43746d9765b3ac809e2793d742 



It is basically the same as Peff suggests: log only one side of the 
fetch.


As this buglet looks like a recurring theme, and a proper fix is 
preferable over repeated work-arounds. To me it looks like we need 
some sort of locking on Windows. Unless your friends at Microsoft have 
an ace in their sleeves that lets us have atomic O_APPEND the POSIX 
way...


The official Windows CRT for open() does document an O_APPEND, but after
looking at the distributed source, I'm not sure I trust it.


I looked at the CRT code, too, and no, we can't trust it.


I have not looked at the MSYS version of the CRT yet so I cannot comment
on it.

I did find that the following code does what we want.  Notice the use of
FILE_APPEND_DATA in place of the usual GENERIC_READ.  (And yes, the docs
are very vague here.)


As far as I understand, FILE_APPEND_DATA is just a permission flag (I'm 
not sure why that would be necessary), but at least the documentation 
states that it is still necessary for the caller to set the file pointer.


But I haven't tried your code below, yet. Should it append the line of 
text on each invocation? And if so, is it an atomic operation?



Before we try a locking solution, perhaps we should tweak mingw_open()
to use something like this to get a proper HANDLE directly and then use
_open_osfhandle() to get a "fd" for it.

Jeff

-

#include 

int main()
{
 HANDLE h = CreateFileW(L"foo.txt",
     FILE_APPEND_DATA,
     FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
     NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

 const char * buf = "this is a test\n";
 DWORD len = strlen(buf);
 DWORD nbw;

 WriteFile(h, buf, len, , NULL);
 CloseHandle(h);

 return 0;
}

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-10 Thread Jeff Hostetler





On 8/10/2018 12:51 PM, Jeff Hostetler wrote:



On 8/10/2018 12:15 PM, Johannes Sixt wrote:

Am 09.08.2018 um 19:35 schrieb Johannes Schindelin via GitGitGadget:
I reported a couple of times that t5552 is not passing reliably. It 
has now

reached next, and will no doubt infect master soon.

Turns out that it is not a Windows-specific issue, even if it occurs 
a lot

more often on Windows than elsewhere.

The culprit is that two processes try simultaneously to write to the 
same

file specified via GIT_TRACE_PACKET, and it is not well defined how that
should work, even on Linux.


Thanks for digging down to the root cause. As has been said, the 
behavior of O_APPEND is well-defined under POSIX, but last time I 
looked for equivalent feature on Windows, I did not find any.


Last time was when I worked around the same failure in 
t5503-tagfollow.sh in my private builds:


https://github.com/j6t/git/commit/9a447a6844b50b43746d9765b3ac809e2793d742 



It is basically the same as Peff suggests: log only one side of the 
fetch.


As this buglet looks like a recurring theme, and a proper fix is 
preferable over repeated work-arounds. To me it looks like we need 
some sort of locking on Windows. Unless your friends at Microsoft have 
an ace in their sleeves that lets us have atomic O_APPEND the POSIX 
way...


The official Windows CRT for open() does document an O_APPEND, but after
looking at the distributed source, I'm not sure I trust it.

I have not looked at the MSYS version of the CRT yet so I cannot comment
on it.

I did find that the following code does what we want.  Notice the use of
FILE_APPEND_DATA in place of the usual GENERIC_READ.  (And yes, the docs
are very vague here.)


d'oh

s/GENERIC_READ/GENERIC_WRITE/



Before we try a locking solution, perhaps we should tweak mingw_open()
to use something like this to get a proper HANDLE directly and then use
_open_osfhandle() to get a "fd" for it.

Jeff

-

#include 

int main()
{
 HANDLE h = CreateFileW(L"foo.txt",
     FILE_APPEND_DATA,
     FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
     NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

 const char * buf = "this is a test\n";
 DWORD len = strlen(buf);
 DWORD nbw;

 WriteFile(h, buf, len, , NULL);
 CloseHandle(h);

 return 0;
}

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-10 Thread Jeff Hostetler





On 8/10/2018 12:15 PM, Johannes Sixt wrote:

Am 09.08.2018 um 19:35 schrieb Johannes Schindelin via GitGitGadget:
I reported a couple of times that t5552 is not passing reliably. It 
has now

reached next, and will no doubt infect master soon.

Turns out that it is not a Windows-specific issue, even if it occurs a 
lot

more often on Windows than elsewhere.

The culprit is that two processes try simultaneously to write to the same
file specified via GIT_TRACE_PACKET, and it is not well defined how that
should work, even on Linux.


Thanks for digging down to the root cause. As has been said, the 
behavior of O_APPEND is well-defined under POSIX, but last time I looked 
for equivalent feature on Windows, I did not find any.


Last time was when I worked around the same failure in 
t5503-tagfollow.sh in my private builds:


https://github.com/j6t/git/commit/9a447a6844b50b43746d9765b3ac809e2793d742

It is basically the same as Peff suggests: log only one side of the fetch.

As this buglet looks like a recurring theme, and a proper fix is 
preferable over repeated work-arounds. To me it looks like we need some 
sort of locking on Windows. Unless your friends at Microsoft have an ace 
in their sleeves that lets us have atomic O_APPEND the POSIX way...


The official Windows CRT for open() does document an O_APPEND, but after
looking at the distributed source, I'm not sure I trust it.

I have not looked at the MSYS version of the CRT yet so I cannot comment
on it.

I did find that the following code does what we want.  Notice the use of
FILE_APPEND_DATA in place of the usual GENERIC_READ.  (And yes, the docs
are very vague here.)

Before we try a locking solution, perhaps we should tweak mingw_open()
to use something like this to get a proper HANDLE directly and then use
_open_osfhandle() to get a "fd" for it.

Jeff

-

#include 

int main()
{
HANDLE h = CreateFileW(L"foo.txt",
FILE_APPEND_DATA,
FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

const char * buf = "this is a test\n";
DWORD len = strlen(buf);
DWORD nbw;

WriteFile(h, buf, len, , NULL);
CloseHandle(h);

return 0;
}

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-10 Thread Johannes Schindelin

Hi,

On Fri, 10 Aug 2018, Jeff King wrote:

> On Thu, Aug 09, 2018 at 01:49:52PM -0700, Junio C Hamano wrote:
> 
> > Jeff King  writes:
> > 
> > > Are you sure that it's not well-defined? We open the path with O_APPEND,
> > > which means every write() will be atomically positioned at the end of
> > > file. So we would never lose or overwrite data.
> > >
> > > We do our own buffering in a strbuf, writing the result out in a single
> > > write() call (modulo the OS returning a short write, but that should not
> > > generally happen when writing short strings to a file). So we should get
> > > individual trace lines as atomic units.
> > >
> > > The order of lines from the two processes is undefined, of course.
> > 
> > Correct.  But I am more worried about the "mixed/overwriting"
> > breakage, if there is one; it means we may need to be prepared for
> > systems that lack O_APPEND that works correctly.  I initially just
> > assumed that it was what Dscho was seeing, but after re-reading his
> > message, I am not sure anymore.
> > 
> > I think the "do not trace the other side" approach you suggest for
> > these tests that only care about one side is more appropriate
> > solution for this particular case.  We then do not have to worry
> > about overwriting or output from both sides mixed randomly.

Just so everybody knows: I never received the above two mails. Something
is seriously rotten in GMX. This seems to have started a week or two ago.

> Here it is as a patch on top of jt/fetch-negotiator-skipping, which lets
> us pursue any fix for interleaved trace output on Windows without the
> pressure of an impending flaky test.
> 
> My gut says that looking into making O_APPEND work there is going to be
> the nicest solution, but my gut is not very well versed in Windows
> subtleties. ;)

Your patch seems to be a patch in the original sense of the word: it is
not a solution.

And let me spell it out very clearly (as I have mentioned to Jonathan a
couple times, without any response): t5552 fails regularly, way more than
rarely, see for yourself:

https://git-for-windows.visualstudio.com/git/_build/index?definitionId=1&_a=history

And the reason it fails was analyzed by me and described in the commit
message (I am sorry that the cover letter was still stale and talked
about flock(), which I had decided was not the right approach after
fighting with Linux over it).

What you cannot see easily, unless you go the route that I offered
Jonathan (open a PR on gitgitgadget/git which will automatically run the
test suite on Windows, macOS and Linux, and of course you can do anything
in a PR, including narrowing down what tests are run) is that sometimes
those lines in the `trace` file were clearly *incomplete*. That is, even
if both processes tried to write atomically, one managed to overwrite the
other's buffer in flight.

This is a pretty serious thing, even worse than the failing test suite,
and I don't think that your patch acknowledges how serious this is.

And please don't give me that "but it works on Linux" response. Seriously.
Sheesh.

Even if you have not managed to trigger it, the POSIX standard seems not
to define clearly what the behavior is of two competing write() calls,
unless you lock the files appropriately, as my patch series does.

So unless you are willing to ignore, to willfully keep this breakage, I
would suggest not to introduce the ugliness of an overridden upload-pack
for the sole purpose of disabling the tracing on one side, but instead to
get this here bug fixed, by helping me with this here patch series.

We don't have to rush this, you know? We have to fix it, though.

Thank you,
Dscho

> -- >8 --
> Subject: [PATCH] t5552: suppress upload-pack trace output
> 
> The t5552 test script uses GIT_TRACE_PACKET to monitor what
> git-fetch sends and receives. However, because we're
> accessing a local repository, the child upload-pack also
> sends trace output to the same file.
> 
> On Linux, this works out OK. We open the trace file with
> O_APPEND, so all writes are atomically positioned at the end
> of the file. No data can be overwritten or omitted. And
> since we prepare our small writes in a strbuf and write them
> with a single write(), we should see each line as an atomic
> unit. The order of lines between the two processes is
> undefined, but the test script greps only for "fetch>" or
> "fetch<" lines. So under Linux, the test results are
> deterministic.
> 
> The test fails intermittently on Windows, however,
> reportedly even overwriting bits of the output file (i.e.,
> O_APPEND does not seem to give us an atomic position+write).
> 
> Since the test only cares about the trace output from fetch,
> we can just disable the output from upload-pack. That
> doesn't solve the greater question of O_APPEND/trace issues
> under Windows, but it easily fixes the flakiness from this
> test.
> 
> Reported-by: Johannes Schindelin 
> Signed-off-by: Jeff King 
> ---
> I'm assuming that this really isn't

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-10 Thread Johannes Sixt


Am 09.08.2018 um 19:35 schrieb Johannes Schindelin via GitGitGadget:

I reported a couple of times that t5552 is not passing reliably. It has now
reached next, and will no doubt infect master soon.

Turns out that it is not a Windows-specific issue, even if it occurs a lot
more often on Windows than elsewhere.

The culprit is that two processes try simultaneously to write to the same
file specified via GIT_TRACE_PACKET, and it is not well defined how that
should work, even on Linux.


Thanks for digging down to the root cause. As has been said, the 
behavior of O_APPEND is well-defined under POSIX, but last time I looked 
for equivalent feature on Windows, I did not find any.


Last time was when I worked around the same failure in 
t5503-tagfollow.sh in my private builds:


https://github.com/j6t/git/commit/9a447a6844b50b43746d9765b3ac809e2793d742

It is basically the same as Peff suggests: log only one side of the fetch.

As this buglet looks like a recurring theme, and a proper fix is 
preferable over repeated work-arounds. To me it looks like we need some 
sort of locking on Windows. Unless your friends at Microsoft have an ace 
in their sleeves that lets us have atomic O_APPEND the POSIX way...


-- Hannes

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

Jeff King  writes:

> Here it is as a patch on top of jt/fetch-negotiator-skipping, which lets
> us pursue any fix for interleaved trace output on Windows without the
> pressure of an impending flaky test.
>
> My gut says that looking into making O_APPEND work there is going to be
> the nicest solution, but my gut is not very well versed in Windows
> subtleties. ;)

As we know these tests are only interested in "fetch" lines and not
record of communication in the other direction, I think this is a
nice clean-up that is worth doing.

For only two grep's it may not look worth doing, but it would be
also a good clean-up to give an got_acked helper that sits next to
have_sent and have_not_sent helpers for readability.  That is of
course outside the scope of this change.

Will queue.  Thanks.

> -- >8 --
> Subject: [PATCH] t5552: suppress upload-pack trace output
>
> The t5552 test script uses GIT_TRACE_PACKET to monitor what
> git-fetch sends and receives. However, because we're
> accessing a local repository, the child upload-pack also
> sends trace output to the same file.
>
> On Linux, this works out OK. We open the trace file with
> O_APPEND, so all writes are atomically positioned at the end
> of the file. No data can be overwritten or omitted. And
> since we prepare our small writes in a strbuf and write them
> with a single write(), we should see each line as an atomic
> unit. The order of lines between the two processes is
> undefined, but the test script greps only for "fetch>" or
> "fetch<" lines. So under Linux, the test results are
> deterministic.
>
> The test fails intermittently on Windows, however,
> reportedly even overwriting bits of the output file (i.e.,
> O_APPEND does not seem to give us an atomic position+write).
>
> Since the test only cares about the trace output from fetch,
> we can just disable the output from upload-pack. That
> doesn't solve the greater question of O_APPEND/trace issues
> under Windows, but it easily fixes the flakiness from this
> test.
>
> Reported-by: Johannes Schindelin 
> Signed-off-by: Jeff King 
> ---
> I'm assuming that this really isn't triggerable on Linux. I tried and
> couldn't manage to get it to fail, and the reasoning above explains why.
> But I wasn't 100% clear that Dscho hadn't seen it fail on non-Windows.
>
>  t/t5552-skipping-fetch-negotiator.sh | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/t/t5552-skipping-fetch-negotiator.sh 
> b/t/t5552-skipping-fetch-negotiator.sh
> index 0a8e0e42ed..0ad50dd839 100755
> --- a/t/t5552-skipping-fetch-negotiator.sh
> +++ b/t/t5552-skipping-fetch-negotiator.sh
> @@ -28,6 +28,19 @@ have_not_sent () {
>   done
>  }
>  
> +# trace_fetch   [args]
> +#
> +# Trace the packet output of fetch, but make sure we disable the variable
> +# in the child upload-pack, so we don't combine the results in the same file.
> +trace_fetch () {
> + client=$1; shift
> + server=$1; shift
> + GIT_TRACE_PACKET="$(pwd)/trace" \
> + git -C "$client" fetch \
> +   --upload-pack 'unset GIT_TRACE_PACKET; git-upload-pack' \
> +   "$server" "$@"
> +}
> +
>  test_expect_success 'commits with no parents are sent regardless of skip 
> distance' '
>   git init server &&
>   test_commit -C server to_fetch &&
> @@ -42,7 +55,7 @@ test_expect_success 'commits with no parents are sent 
> regardless of skip distanc
>   # "c1" has no parent, it is still sent as "have" even though it would
>   # normally be skipped.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
> + trace_fetch client "$(pwd)/server" &&
>   have_sent c7 c5 c2 c1 &&
>   have_not_sent c6 c4 c3
>  '
> @@ -65,7 +78,7 @@ test_expect_success 'when two skips collide, favor the 
> larger one' '
>   # the next "have" sent will be "c1" (from "c6" skip 4) and not "c4"
>   # (from "c5side" skip 1).
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
> + trace_fetch client "$(pwd)/server" &&
>   have_sent c5side c11 c9 c6 c1 &&
>   have_not_sent c10 c8 c7 c5 c4 c3 c2
>  '
> @@ -91,7 +104,7 @@ test_expect_success 'use ref advertisement to filter out 
> commits' '
>   # not need to send any ancestors of "c3", but we still need to send "c3"
>   # itself.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch origin to_fetch &&
> + trace_fetch client origin to_fetch &&
>   have_sent c5 c4^ c2side &&
>   have_not_sent c4 c4^^ c4^^^
>  '
> @@ -121,7 +134,7 @@ test_expect_success 'handle clock skew' '
>   # and sent, because (due to clock skew) its only parent has already been
>   # popped off the priority queue.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> -

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

Jeff King  writes:

> Here it is as a patch on top of jt/fetch-negotiator-skipping, which lets
> us pursue any fix for interleaved trace output on Windows without the
> pressure of an impending flaky test.
>
> My gut says that looking into making O_APPEND work there is going to be
> the nicest solution, but my gut is not very well versed in Windows
> subtleties. ;)

As we know these tests are only interested in "fetch" lines and not
record of communication in the other direction, I think this is a
nice clean-up that is worth doing.

For only two grep's it may not look worth doing, but it would be
also a good clean-up to give an got_acked helper that sits next to
have_sent and have_not_sent helpers for readability.  That is of
course outside the scope of this change.

Will queue.  Thansk.

> -- >8 --
> Subject: [PATCH] t5552: suppress upload-pack trace output
>
> The t5552 test script uses GIT_TRACE_PACKET to monitor what
> git-fetch sends and receives. However, because we're
> accessing a local repository, the child upload-pack also
> sends trace output to the same file.
>
> On Linux, this works out OK. We open the trace file with
> O_APPEND, so all writes are atomically positioned at the end
> of the file. No data can be overwritten or omitted. And
> since we prepare our small writes in a strbuf and write them
> with a single write(), we should see each line as an atomic
> unit. The order of lines between the two processes is
> undefined, but the test script greps only for "fetch>" or
> "fetch<" lines. So under Linux, the test results are
> deterministic.
>
> The test fails intermittently on Windows, however,
> reportedly even overwriting bits of the output file (i.e.,
> O_APPEND does not seem to give us an atomic position+write).
>
> Since the test only cares about the trace output from fetch,
> we can just disable the output from upload-pack. That
> doesn't solve the greater question of O_APPEND/trace issues
> under Windows, but it easily fixes the flakiness from this
> test.
>
> Reported-by: Johannes Schindelin 
> Signed-off-by: Jeff King 
> ---
> I'm assuming that this really isn't triggerable on Linux. I tried and
> couldn't manage to get it to fail, and the reasoning above explains why.
> But I wasn't 100% clear that Dscho hadn't seen it fail on non-Windows.
>
>  t/t5552-skipping-fetch-negotiator.sh | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/t/t5552-skipping-fetch-negotiator.sh 
> b/t/t5552-skipping-fetch-negotiator.sh
> index 0a8e0e42ed..0ad50dd839 100755
> --- a/t/t5552-skipping-fetch-negotiator.sh
> +++ b/t/t5552-skipping-fetch-negotiator.sh
> @@ -28,6 +28,19 @@ have_not_sent () {
>   done
>  }
>  
> +# trace_fetch   [args]
> +#
> +# Trace the packet output of fetch, but make sure we disable the variable
> +# in the child upload-pack, so we don't combine the results in the same file.
> +trace_fetch () {
> + client=$1; shift
> + server=$1; shift
> + GIT_TRACE_PACKET="$(pwd)/trace" \
> + git -C "$client" fetch \
> +   --upload-pack 'unset GIT_TRACE_PACKET; git-upload-pack' \
> +   "$server" "$@"
> +}
> +
>  test_expect_success 'commits with no parents are sent regardless of skip 
> distance' '
>   git init server &&
>   test_commit -C server to_fetch &&
> @@ -42,7 +55,7 @@ test_expect_success 'commits with no parents are sent 
> regardless of skip distanc
>   # "c1" has no parent, it is still sent as "have" even though it would
>   # normally be skipped.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
> + trace_fetch client "$(pwd)/server" &&
>   have_sent c7 c5 c2 c1 &&
>   have_not_sent c6 c4 c3
>  '
> @@ -65,7 +78,7 @@ test_expect_success 'when two skips collide, favor the 
> larger one' '
>   # the next "have" sent will be "c1" (from "c6" skip 4) and not "c4"
>   # (from "c5side" skip 1).
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
> + trace_fetch client "$(pwd)/server" &&
>   have_sent c5side c11 c9 c6 c1 &&
>   have_not_sent c10 c8 c7 c5 c4 c3 c2
>  '
> @@ -91,7 +104,7 @@ test_expect_success 'use ref advertisement to filter out 
> commits' '
>   # not need to send any ancestors of "c3", but we still need to send "c3"
>   # itself.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch origin to_fetch &&
> + trace_fetch client origin to_fetch &&
>   have_sent c5 c4^ c2side &&
>   have_not_sent c4 c4^^ c4^^^
>  '
> @@ -121,7 +134,7 @@ test_expect_success 'handle clock skew' '
>   # and sent, because (due to clock skew) its only parent has already been
>   # popped off the priority queue.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> -

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-10 Thread Jeff King

On Thu, Aug 09, 2018 at 01:49:52PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Are you sure that it's not well-defined? We open the path with O_APPEND,
> > which means every write() will be atomically positioned at the end of
> > file. So we would never lose or overwrite data.
> >
> > We do our own buffering in a strbuf, writing the result out in a single
> > write() call (modulo the OS returning a short write, but that should not
> > generally happen when writing short strings to a file). So we should get
> > individual trace lines as atomic units.
> >
> > The order of lines from the two processes is undefined, of course.
> 
> Correct.  But I am more worried about the "mixed/overwriting"
> breakage, if there is one; it means we may need to be prepared for
> systems that lack O_APPEND that works correctly.  I initially just
> assumed that it was what Dscho was seeing, but after re-reading his
> message, I am not sure anymore.
> 
> I think the "do not trace the other side" approach you suggest for
> these tests that only care about one side is more appropriate
> solution for this particular case.  We then do not have to worry
> about overwriting or output from both sides mixed randomly.

Here it is as a patch on top of jt/fetch-negotiator-skipping, which lets
us pursue any fix for interleaved trace output on Windows without the
pressure of an impending flaky test.

My gut says that looking into making O_APPEND work there is going to be
the nicest solution, but my gut is not very well versed in Windows
subtleties. ;)

-- >8 --
Subject: [PATCH] t5552: suppress upload-pack trace output

The t5552 test script uses GIT_TRACE_PACKET to monitor what
git-fetch sends and receives. However, because we're
accessing a local repository, the child upload-pack also
sends trace output to the same file.

On Linux, this works out OK. We open the trace file with
O_APPEND, so all writes are atomically positioned at the end
of the file. No data can be overwritten or omitted. And
since we prepare our small writes in a strbuf and write them
with a single write(), we should see each line as an atomic
unit. The order of lines between the two processes is
undefined, but the test script greps only for "fetch>" or
"fetch<" lines. So under Linux, the test results are
deterministic.

The test fails intermittently on Windows, however,
reportedly even overwriting bits of the output file (i.e.,
O_APPEND does not seem to give us an atomic position+write).

Since the test only cares about the trace output from fetch,
we can just disable the output from upload-pack. That
doesn't solve the greater question of O_APPEND/trace issues
under Windows, but it easily fixes the flakiness from this
test.

Reported-by: Johannes Schindelin 
Signed-off-by: Jeff King 
---
I'm assuming that this really isn't triggerable on Linux. I tried and
couldn't manage to get it to fail, and the reasoning above explains why.
But I wasn't 100% clear that Dscho hadn't seen it fail on non-Windows.

 t/t5552-skipping-fetch-negotiator.sh | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/t/t5552-skipping-fetch-negotiator.sh 
b/t/t5552-skipping-fetch-negotiator.sh
index 0a8e0e42ed..0ad50dd839 100755
--- a/t/t5552-skipping-fetch-negotiator.sh
+++ b/t/t5552-skipping-fetch-negotiator.sh
@@ -28,6 +28,19 @@ have_not_sent () {
done
 }
 
+# trace_fetch   [args]
+#
+# Trace the packet output of fetch, but make sure we disable the variable
+# in the child upload-pack, so we don't combine the results in the same file.
+trace_fetch () {
+   client=$1; shift
+   server=$1; shift
+   GIT_TRACE_PACKET="$(pwd)/trace" \
+   git -C "$client" fetch \
+ --upload-pack 'unset GIT_TRACE_PACKET; git-upload-pack' \
+ "$server" "$@"
+}
+
 test_expect_success 'commits with no parents are sent regardless of skip 
distance' '
git init server &&
test_commit -C server to_fetch &&
@@ -42,7 +55,7 @@ test_expect_success 'commits with no parents are sent 
regardless of skip distanc
# "c1" has no parent, it is still sent as "have" even though it would
# normally be skipped.
test_config -C client fetch.negotiationalgorithm skipping &&
-   GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+   trace_fetch client "$(pwd)/server" &&
have_sent c7 c5 c2 c1 &&
have_not_sent c6 c4 c3
 '
@@ -65,7 +78,7 @@ test_expect_success 'when two skips collide, favor the larger 
one' '
# the next "have" sent will be "c1" (from "c6" skip 4) and not "c4"
# (from "c5side" skip 1).
test_config -C client fetch.negotiationalgorithm skipping &&
-   GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+   trace_fetch client "$(pwd)/server" &&
have_sent c5side c11 c9 c6 c1 &&
have_not_sent c10 c8 c7 c5 c4 c3 c2
 '
@@ -91,7 +104,7 @@ test_expect_success 'use ref advertisement to filter out

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-09 Thread Jeff King

On Thu, Aug 09, 2018 at 02:08:56PM -0700, Junio C Hamano wrote:

> > Correct.  But I am more worried about the "mixed/overwriting"
> > breakage, if there is one; it means we may need to be prepared for
> > systems that lack O_APPEND that works correctly.  I initially just
> > assumed that it was what Dscho was seeing, but after re-reading his
> > message, I am not sure anymore.
> >
> > I think the "do not trace the other side" approach you suggest for
> > these tests that only care about one side is more appropriate
> > solution for this particular case.  We then do not have to worry
> > about overwriting or output from both sides mixed randomly.
> 
> A concluding sentence I forgot to add, after saying "this is simpler
> and better to fix test breakage", was
> 
>   But if we really are seeing O_APPEND breakage, a mandatory
>   locking mechanism like this one might be necessary to work
>   around it (I seriously hope we do not have to, though).
> 
> Sorry for an additional noise.

Yeah, my earlier email said "if we just care about this test, then...".

But if we do care about making this multiple-tracing case work all the
time, then we'd need some solution. If there's something that can work
like O_APPEND, even if we have to make some system-specific call, I'd
prefer that to locking.

I did some searching for atomic append on Windows and got mixed reports
on whether their FILE_DATA_APPEND does what we want or not. Knowing
nothing about msys, I'd _assume_ that O_APPEND would get translated to
that flag, but I got lost trying to find it in
git-for-windows/msys2-runtime.

Wouldn't it be wonderful if the solution were as simple as making sure
that flag was plumbed through? I seriously doubt it will be that easy,
though. :)

-Peff

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-09 Thread Junio C Hamano

Junio C Hamano  writes:

> Jeff King  writes:
>
>> Are you sure that it's not well-defined? We open the path with O_APPEND,
>> which means every write() will be atomically positioned at the end of
>> file. So we would never lose or overwrite data.
>>
>> We do our own buffering in a strbuf, writing the result out in a single
>> write() call (modulo the OS returning a short write, but that should not
>> generally happen when writing short strings to a file). So we should get
>> individual trace lines as atomic units.
>>
>> The order of lines from the two processes is undefined, of course.
>
> Correct.  But I am more worried about the "mixed/overwriting"
> breakage, if there is one; it means we may need to be prepared for
> systems that lack O_APPEND that works correctly.  I initially just
> assumed that it was what Dscho was seeing, but after re-reading his
> message, I am not sure anymore.
>
> I think the "do not trace the other side" approach you suggest for
> these tests that only care about one side is more appropriate
> solution for this particular case.  We then do not have to worry
> about overwriting or output from both sides mixed randomly.

A concluding sentence I forgot to add, after saying "this is simpler
and better to fix test breakage", was

But if we really are seeing O_APPEND breakage, a mandatory
locking mechanism like this one might be necessary to work
around it (I seriously hope we do not have to, though).

Sorry for an additional noise.

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-09 Thread Junio C Hamano

Jeff King  writes:

> Are you sure that it's not well-defined? We open the path with O_APPEND,
> which means every write() will be atomically positioned at the end of
> file. So we would never lose or overwrite data.
>
> We do our own buffering in a strbuf, writing the result out in a single
> write() call (modulo the OS returning a short write, but that should not
> generally happen when writing short strings to a file). So we should get
> individual trace lines as atomic units.
>
> The order of lines from the two processes is undefined, of course.

Correct.  But I am more worried about the "mixed/overwriting"
breakage, if there is one; it means we may need to be prepared for
systems that lack O_APPEND that works correctly.  I initially just
assumed that it was what Dscho was seeing, but after re-reading his
message, I am not sure anymore.

I think the "do not trace the other side" approach you suggest for
these tests that only care about one side is more appropriate
solution for this particular case.  We then do not have to worry
about overwriting or output from both sides mixed randomly.

> diff --git a/t/t5552-skipping-fetch-negotiator.sh 
> b/t/t5552-skipping-fetch-negotiator.sh
> index 3b60bd44e0..5ad5bece55 100755
> --- a/t/t5552-skipping-fetch-negotiator.sh
> +++ b/t/t5552-skipping-fetch-negotiator.sh
> @@ -28,6 +28,19 @@ have_not_sent () {
>   done
>  }
>  
> +# trace_fetch   [args]
> +#
> +# Trace the packet output of fetch, but make sure we disable the variable
> +# in the child upload-pack, so we don't combine the results in the same file.
> +trace_fetch () {
> + client=$1; shift
> + server=$1; shift
> + GIT_TRACE_PACKET="$(pwd)/trace" \
> + git -C "$client" fetch \
> +   --upload-pack 'unset GIT_TRACE_PACKET; git-upload-pack' \
> +   "$server" "$@"
> +}
> +
>  test_expect_success 'commits with no parents are sent regardless of skip 
> distance' '
>   git init server &&
>   test_commit -C server to_fetch &&
> @@ -42,7 +55,7 @@ test_expect_success 'commits with no parents are sent 
> regardless of skip distanc
>   # "c1" has no parent, it is still sent as "have" even though it would
>   # normally be skipped.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
> + trace_fetch client "$(pwd)/server" &&
>   have_sent c7 c5 c2 c1 &&
>   have_not_sent c6 c4 c3
>  '
> @@ -88,7 +101,7 @@ test_expect_success 'when two skips collide, favor the 
> larger one' '
>   # the next "have" sent will be "c1" (from "c6" skip 4) and not "c4"
>   # (from "c5side" skip 1).
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
> + trace_fetch client "$(pwd)/server" &&
>   have_sent c5side c11 c9 c6 c1 &&
>   have_not_sent c10 c8 c7 c5 c4 c3 c2
>  '
> @@ -114,7 +127,7 @@ test_expect_success 'use ref advertisement to filter out 
> commits' '
>   # not need to send any ancestors of "c3", but we still need to send "c3"
>   # itself.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch origin to_fetch &&
> + trace_fetch client origin to_fetch &&
>   have_sent c5 c4^ c2side &&
>   have_not_sent c4 c4^^ c4^^^
>  '
> @@ -144,7 +157,7 @@ test_expect_success 'handle clock skew' '
>   # and sent, because (due to clock skew) its only parent has already been
>   # popped off the priority queue.
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
> + trace_fetch client "$(pwd)/server" &&
>   have_sent c2 c1 old4 old2 old1 &&
>   have_not_sent old3
>  '
> @@ -176,7 +189,7 @@ test_expect_success 'do not send "have" with ancestors of 
> commits that server AC
>   test_commit -C server commit-on-b1 &&
>  
>   test_config -C client fetch.negotiationalgorithm skipping &&
> - GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" 
> to_fetch &&
> + trace_fetch client "$(pwd)/server" to_fetch &&
>   grep "  fetch" trace &&
>  
>   # fetch-pack sends 2 requests each containing 16 "have" lines before

Re: [PATCH 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-09 Thread Jeff King

On Thu, Aug 09, 2018 at 10:35:24AM -0700, Johannes Schindelin via GitGitGadget 
wrote:

> The culprit is that two processes try simultaneously to write to the same
> file specified via GIT_TRACE_PACKET, and it is not well defined how that
> should work, even on Linux.

Are you sure that it's not well-defined? We open the path with O_APPEND,
which means every write() will be atomically positioned at the end of
file. So we would never lose or overwrite data.

We do our own buffering in a strbuf, writing the result out in a single
write() call (modulo the OS returning a short write, but that should not
generally happen when writing short strings to a file). So we should get
individual trace lines as atomic units.

The order of lines from the two processes is undefined, of course.

> This patch series addresses that by locking the trace fd. I chose to use 
> flock() instead of fcntl() because the Win32 API LockFileEx()
> [https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-lockfileex]
>  
> (which does exactly what I want in this context) has much more similar
> semantics to the former than the latter.

I must admit I'm not excited to see us adding extra locking and
complexity when it is not necessarily needed. Is this a feature we
actually care about delivering for normal users of GIT_TRACE, or do we
just care about solving the test flakiness here?

Because we should be able to do the latter with the patch below.

I also wonder if we should be clearing GIT_TRACE as part of clearing
repo-env. It would do what we want automatically in this case, though
there are definitely cases where I prefer the current behavior (because
I really do want to see the upload-pack side of it). Of course I could
always use "--upload-pack 'GIT_TRACE_PACKET=... upload-pack'", so this
is really just a default.

diff --git a/t/t5552-skipping-fetch-negotiator.sh 
b/t/t5552-skipping-fetch-negotiator.sh
index 3b60bd44e0..5ad5bece55 100755
--- a/t/t5552-skipping-fetch-negotiator.sh
+++ b/t/t5552-skipping-fetch-negotiator.sh
@@ -28,6 +28,19 @@ have_not_sent () {
done
 }
 
+# trace_fetch   [args]
+#
+# Trace the packet output of fetch, but make sure we disable the variable
+# in the child upload-pack, so we don't combine the results in the same file.
+trace_fetch () {
+   client=$1; shift
+   server=$1; shift
+   GIT_TRACE_PACKET="$(pwd)/trace" \
+   git -C "$client" fetch \
+ --upload-pack 'unset GIT_TRACE_PACKET; git-upload-pack' \
+ "$server" "$@"
+}
+
 test_expect_success 'commits with no parents are sent regardless of skip 
distance' '
git init server &&
test_commit -C server to_fetch &&
@@ -42,7 +55,7 @@ test_expect_success 'commits with no parents are sent 
regardless of skip distanc
# "c1" has no parent, it is still sent as "have" even though it would
# normally be skipped.
test_config -C client fetch.negotiationalgorithm skipping &&
-   GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+   trace_fetch client "$(pwd)/server" &&
have_sent c7 c5 c2 c1 &&
have_not_sent c6 c4 c3
 '
@@ -88,7 +101,7 @@ test_expect_success 'when two skips collide, favor the 
larger one' '
# the next "have" sent will be "c1" (from "c6" skip 4) and not "c4"
# (from "c5side" skip 1).
test_config -C client fetch.negotiationalgorithm skipping &&
-   GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+   trace_fetch client "$(pwd)/server" &&
have_sent c5side c11 c9 c6 c1 &&
have_not_sent c10 c8 c7 c5 c4 c3 c2
 '
@@ -114,7 +127,7 @@ test_expect_success 'use ref advertisement to filter out 
commits' '
# not need to send any ancestors of "c3", but we still need to send "c3"
# itself.
test_config -C client fetch.negotiationalgorithm skipping &&
-   GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch origin to_fetch &&
+   trace_fetch client origin to_fetch &&
have_sent c5 c4^ c2side &&
have_not_sent c4 c4^^ c4^^^
 '
@@ -144,7 +157,7 @@ test_expect_success 'handle clock skew' '
# and sent, because (due to clock skew) its only parent has already been
# popped off the priority queue.
test_config -C client fetch.negotiationalgorithm skipping &&
-   GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+   trace_fetch client "$(pwd)/server" &&
have_sent c2 c1 old4 old2 old1 &&
have_not_sent old3
 '
@@ -176,7 +189,7 @@ test_expect_success 'do not send "have" with ancestors of 
commits that server AC
test_commit -C server commit-on-b1 &&
 
test_config -C client fetch.negotiationalgorithm skipping &&
-   GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" 
to_fetch &&
+   trace_fetch client "$(pwd)/server" to_fetch &&
grep "  fetch" trace &&
 
# fetch-pack sends 2 requests each containing 16 "have"

Re: [PATCH 0/4] line-log: be more careful when adjusting multiple line ranges

2018-08-05 Thread Eric Sunshine

On Sat, Aug 4, 2018 at 6:18 PM Johannes Schindelin via GitGitGadget
 wrote:
> Now, I am fairly certain that the changes are correct, but given my track
> record with off-by-one bugs (and once even an off-by-two bug), I would
> really appreciate some thorough review of this code, in particular the
> second one that is the actual bug fix. I am specifically interested in
> reviews from people who know line-log.c pretty well and can tell me whether
> the src[i].end > target[j].end is correct, or whether it should actually
> have been a >= (I tried to wrap my head around this, but I would feel more
> comfortable if a domain expert would analyze this, whistling, and looking
> Eric's way).

Although I fixed a number of bugs in basic range-set manipulation code
(5 years ago), I have no experience with or knowledge of the code
actually dealing with range-set manipulation in relation to diffs, so
I'm not a domain expert by any means, and I'm _still_ trying to wrap
my head around that code.

That said, I left some comments on the individual patches suggesting a
simpler and more correct fix for the crash. (Despite that suggestion
for fixing the crash, I still can't say whether the actual
computations performed by the existing code are correct, since I still
don't understand that aspect of it.)

Re: [PATCH 0/4] Use oid_object_info() instead of read_object_file()

2018-07-18 Thread Junio C Hamano

Johannes Schindelin  writes:

> Hi Olga,
>
> On Fri, 13 Jul 2018, Оля Тележная wrote:
>
>> 2018-07-09 11:27 GMT+03:00 Оля Тележная :
>> > Hello everyone,
>> > This is my new attempt to start using oid_object_info_extended() in
>> > ref-filter. You could look at previous one [1] [2] but it is not
>> > necessary.
>> >
>> > The goal (still) is to improve performance by avoiding calling expensive
>> > functions when we don't need the information they provide
>> > or when we could get it by using a cheaper function.
>> >
>> > This patch is a middle step. In the end, I want to add new atoms
>> > ("objectsize:disk" and "deltabase") and reuse ref-filter logic in
>> > cat-file command.
>> >
>> > I also know about problems with memory leaks in ref-filter: that would
>> > be my next task that I will work on. Since I did not generate any new
>> > leaks in this patch (just use existing ones), I decided to put this
>> > part on a review and fix leaks as a separate task.
>> 
>> UPDATES since v1:
>> add init to eaten variable (thanks to Szeder Gabor, Johannes Schindelin)
>> improve second commit message (thanks to Junio C Hamano)
>> add static keyword (thanks to Ramsay Jones)
>> 
>> >
>> > Thank you!
>> >
>> > [1] https://github.com/git/git/pull/493
>
> Could you please populate the description of that PR so that SubmitGit
> picks it up as cover letter?

Thanks for suggesting that.  Yes, an updated version of a series,
even if it is a small one with just 4 or 5 patches, becomes much
easier to read with a well-written cover letter.

Re: [PATCH 0/4] Use oid_object_info() instead of read_object_file()

2018-07-18 Thread Johannes Schindelin

Hi Olga,

On Fri, 13 Jul 2018, Оля Тележная wrote:

> 2018-07-09 11:27 GMT+03:00 Оля Тележная :
> > Hello everyone,
> > This is my new attempt to start using oid_object_info_extended() in
> > ref-filter. You could look at previous one [1] [2] but it is not
> > necessary.
> >
> > The goal (still) is to improve performance by avoiding calling expensive
> > functions when we don't need the information they provide
> > or when we could get it by using a cheaper function.
> >
> > This patch is a middle step. In the end, I want to add new atoms
> > ("objectsize:disk" and "deltabase") and reuse ref-filter logic in
> > cat-file command.
> >
> > I also know about problems with memory leaks in ref-filter: that would
> > be my next task that I will work on. Since I did not generate any new
> > leaks in this patch (just use existing ones), I decided to put this
> > part on a review and fix leaks as a separate task.
> 
> UPDATES since v1:
> add init to eaten variable (thanks to Szeder Gabor, Johannes Schindelin)
> improve second commit message (thanks to Junio C Hamano)
> add static keyword (thanks to Ramsay Jones)
> 
> >
> > Thank you!
> >
> > [1] https://github.com/git/git/pull/493

Could you please populate the description of that PR so that SubmitGit
picks it up as cover letter?

Thanks,
Johannes

> > [2] 
> > https://public-inbox.org/git/010201637254c969-a346030e-0b75-41ad-8ef3-2ac7e04ba4fb-000...@eu-west-1.amazonses.com/
>

Re: [PATCH 0/4] Use oid_object_info() instead of read_object_file()

2018-07-13 Thread Оля Тележная

2018-07-09 11:27 GMT+03:00 Оля Тележная :
> Hello everyone,
> This is my new attempt to start using oid_object_info_extended() in
> ref-filter. You could look at previous one [1] [2] but it is not
> necessary.
>
> The goal (still) is to improve performance by avoiding calling expensive
> functions when we don't need the information they provide
> or when we could get it by using a cheaper function.
>
> This patch is a middle step. In the end, I want to add new atoms
> ("objectsize:disk" and "deltabase") and reuse ref-filter logic in
> cat-file command.
>
> I also know about problems with memory leaks in ref-filter: that would
> be my next task that I will work on. Since I did not generate any new
> leaks in this patch (just use existing ones), I decided to put this
> part on a review and fix leaks as a separate task.

UPDATES since v1:
add init to eaten variable (thanks to Szeder Gabor, Johannes Schindelin)
improve second commit message (thanks to Junio C Hamano)
add static keyword (thanks to Ramsay Jones)

>
> Thank you!
>
> [1] https://github.com/git/git/pull/493
> [2] 
> https://public-inbox.org/git/010201637254c969-a346030e-0b75-41ad-8ef3-2ac7e04ba4fb-000...@eu-west-1.amazonses.com/

Re: [PATCH 0/4] Use oid_object_info() instead of read_object_file()

2018-07-10 Thread Johannes Schindelin

Hi Olga,

On Mon, 9 Jul 2018, Оля Тележная wrote:

> [2]
> https://public-inbox.org/git/010201637254c969-a346030e-0b75-41ad-8ef3-2ac7e04ba4fb-000...@eu-west-1.amazonses.com/

This type of Message-Id makes me think that you used SubmitGit to send
this patch series.

The main problem I see here is that the patches are not sent as replies to
this cover letter, and therefore they are seemingly disconnected on the
mailing list.

It was also my impression that SubmitGit started supporting sending cover
letters, in which case you would not have to jump through hoops to thread
the mails properly. But for that to work, the PR has to have a description
which is then used as cover letter. I do not see any description in
https://github.com/git/git/pull/520, though. Maybe provide one?

Ciao,
Johannes

P.S.: You might have noticed that I am working (slowly, but steadily) on a
contender for SubmitGit that I call GitGitGadget. Originally, I really
wanted to enhance SubmitGit instead because I am a big believer of *not*
reinventing the wheel (so much energy gets wasted that way).

However, in this case the limitations of the chosen language (I do not
want to learn Scala, I have absolutely zero need to know Scala in any of
my other endeavors, and my time to learn new things is limited, so I spend
it wisely) and the limitations of the design (the UI is completely
separate from GitHub, you have to allow Amazon to send mails in your name,
and SubmitGit's design makes it impossible to work bi-directionally, it is
only GitHub -> mailing list, while I also want the option to add replies
on the mailing list as comments to the GitHub PR in the future) made me
reconsider.

If you want to kick the tires, so to say, I welcome you to give
GitGitGadget a try. It would require only a couple of things from you:

- You would have to settle for a branch name, and then not open new PRs
for every iteration you want to send, but force-push the branch instead.

- You would have to open a PR at https://github.com/gitgitgadget/git.

- You would have to provide the cover letter via the PR's description (and
update that description before sending newer iterations).

- I would have to add you to the list of users allowed to send patches via
GitGitGadget (GitGitGadget has some really light-weight access control
to prevent spamming).

- You would then send a new iteration by simply adding a comment to your
PR that contains this command: /submit

- To integrate well with previous patch series iterations (i.e. to connect
the threads), I would have to come up with a little bit of tooling to
add some metadata that I have to reconstruct manually from your
previously-sent iterations.

Re: [PATCH 0/4] Use oid_object_info() instead of read_object_file()

2018-07-09 Thread Junio C Hamano

Оля Тележная   writes:

> Hello everyone,
> This is my new attempt to start using oid_object_info_extended() in
> ref-filter. You could look at previous one [1] [2] but it is not
> necessary.

Yup, it sounds like a sensible thing to do to try asking object-info
helper instead of reading the whole object in-core and inspecting it
ourselves when we can avoid it.

Re: [PATCH 0/4] fsck: doc fixes & fetch.fsck.* implementation

2018-05-28 Thread Junio C Hamano

Ævar Arnfjörð Bjarmason   writes:

> On Thu, May 24, 2018 at 9:02 PM, Jeff King  wrote:
>> On Thu, May 24, 2018 at 07:04:04PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>>> That doesn't work, because that's for the server-side, but I need the
>>> fetch.fsck.* that doesn't exist. This works (I'll send a better patch
>>> with tests / docs etc. soon):
>>
>> Yeah, I think this is the right direction.

It seems that this 4-patch series is sufficiently commented and
earlier parts can be further polished soonish using help made by
others.  I need to re-read the last patch (implementation) but I
think this generally is good.

Thanks.

Re: [PATCH 0/4] a few mark_parents_uninteresting cleanups

2018-05-14 Thread Derrick Stolee


On 5/11/2018 2:00 PM, Jeff King wrote:

This is a follow-up to the discussion from February:

   
https://public-inbox.org/git/1519522496-73090-1-git-send-email-dsto...@microsoft.com/

There I theorized that some of these extra has_object_file() checks were
unnecessary. After poking around a bit, I've convinced myself that this
is the case, so here are some patches.

After Stolee's fix in ebbed3ba04 (revision.c: reduce object database
queries, 2018-02-24), I doubt this will provide any noticeable speedup.
IMHO the value is just in simplifying the logic.

The first two patches are the actual has_object_file simplifications.
The second two are an attempt to fix some head-scratching I had while
reading mark_parents_uninteresting(). I hope the result is easier to
follow, but I may have just shuffled one confusion for another. :)

   [1/4]: mark_tree_contents_uninteresting(): drop missing object check
   [2/4]: mark_parents_uninteresting(): drop missing object check
   [3/4]: mark_parents_uninteresting(): replace list with stack
   [4/4]: mark_parents_uninteresting(): avoid most allocation

  revision.c | 90 ++
  1 file changed, 50 insertions(+), 40 deletions(-)

-Peff


This series looks good to me. I found Patch 3 hard to read in the diff, 
so I just looked at the final result and the new arrangement is very 
clear about how it should behave.


Reviewed-by: Derrick Stolee

Re: [PATCH 0/4] doc: cleaning up instances of \--

2018-05-10 Thread Jeff King

On Wed, Apr 18, 2018 at 01:24:55PM +0900, Junio C Hamano wrote:

> Martin Ågren  writes:
> 
> > This is a patch series to convert \-- to -- in our documentation. The
> > first patch is a reiteration of 1c262bb7b2 (doc: convert \--option to
> > --option, 2015-05-13) to fix some instances that have appeared since.
> > The other three patches deal with standalone "\--" which we can't
> > always turn into "--" since it can be rendered as an em dash.
> 
> All looked sensible.  As you mentioned in [2/4], "\--::" that is
> part of an enumulation appear in documentation for about a dozen
> commands after the series, but I do not think we can avoid it.
> 
> One thing that makes me wonder related to these patches is if a
> newer AsciiDoc we assume lets us do without {litdd} macro.  This
> series and our earlier effort like 1c262bb7 ("doc: convert \--option
> to --option", 2015-05-13) mentions that "\--word" is less pleasant
> on the eyes than "--word", but the ugliness "two{litdd}words" has
> over "two--words" is far worse than that, so...

I think many cases that use {litdd} would be better off using literal
backticks anyway (e.g., git-add.txt mentions the filename
`git-add--interactive.perl`).

There are certainly a few that can't, though (e.g., config.txt uses
linkgit:git-web{litdd}browse[1]).  I agree that "\--" is less ugly there
(and seems to work on my modern asciidoc). There's some history on the
litdd versus "\--" choice in 565e135a1e (Documentation: quote
double-dash for AsciiDoc, 2011-06-29). That in turn references the
2839478774 (Work around em-dash handling in newer AsciiDoc, 2010-08-23),
but I wouldn't be surprised if all of that is now obsolete with our
AsciiDoc 8+ requirement.

-Peff

PS Late review, I know, but the patches look good to me. :)

Re: [PATCH 0/4] subtree: move out of contrib

2018-05-01 Thread Ævar Arnfjörð Bjarmason

On Tue, May 01 2018, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Mon, 30 Apr 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> I think at this point git-subtree is widely used enough to move out of
>> contrib/, maybe others disagree, but patches are always better for
>> discussion that patch-less ML posts.
>
> Sure, it is used widely enough.
>
> However, it flies in the face of so many GSoC efforts to introduce yet
> another one of those poorly portable Unix shell scripts, as central part
> of Git's code base.
>
> The script itself does look quite straight-forward to port to a builtin,
> so why not give it a try?

That's a valid point. I think it makes sense to leave that aside for
now, maybe the consensus is that subtree is fine in every way except
we'd like to have a policy not to introduce new shellscript built-ins.

Let's first just assume it's in C already and look at it in terms of its
functionality, to figure out if it's worth even getting to that point.

> If you are completely opposed to porting it to C, I will be completely
> opposed to moving it out of contrib/.

This series shows that we should split the concern about whether
something lives in contrib/ from whether it's built/installed by
default.

No matter if we decide that subtree should be a blessed default command
it makes sense to move it out of contrib, purely because as can be seen
from this series it'll replace >100 lines of hacks with 1 line in our
main Makefile.

We can then just e.g. add a flag to guard for it,
e.g. CONTRIB_SUBTREE=YesPlease.

But that's just an internal implementation detail of how we manage code
sitting in git.git.

Re: [PATCH 0/4] subtree: move out of contrib

2018-05-01 Thread Johannes Schindelin

Hi Ævar,

On Mon, 30 Apr 2018, Ævar Arnfjörð Bjarmason wrote:

> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion that patch-less ML posts.

Sure, it is used widely enough.

However, it flies in the face of so many GSoC efforts to introduce yet
another one of those poorly portable Unix shell scripts, as central part
of Git's code base.

The script itself does look quite straight-forward to port to a builtin,
so why not give it a try?

If you are completely opposed to porting it to C, I will be completely
opposed to moving it out of contrib/.

If you need help with porting it, please come up with a task plan and I
can jump in, to help (but please do collaborate with me on this one, don't
leave all of the hard work to me).

Ciao,
Dscho

Re: [PATCH 0/4] subtree: move out of contrib

2018-05-01 Thread Duy Nguyen

On Mon, Apr 30, 2018 at 11:50 AM, Ævar Arnfjörð Bjarmason
 wrote:
> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion that patch-less ML posts.

After narrow/partial clone becomes real, it should be "easy" to
implement some sort of narrow checkout that would achieve the same
thing. But it took me forever with all other stuff to get back to
this.

If we remove it from contrib and there are people willing to
update/maintain it, should it be a separate repository then? The
willing people will have much more freedom to update it. And I don't
have to answer the questions about who will maintain this thing in
git.git. I don't like the idea of adding new (official) shell-based
commands either to be honest.
-- 
Duy

Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Avery Pennarun

On Mon, Apr 30, 2018 at 6:21 PM, Ævar Arnfjörð Bjarmason
 wrote:
> Pretty clear it's garbage data, unless we're to believe that the
> relative interest of submodules in the US, Germany and Sweden is 51, 64
> & 84, but 75, 100 and 0 for subtree.

Oh yeah, Swedish people hate git-subtree.  Nobody knows why.

Avery

Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Ævar Arnfjörð Bjarmason

On Mon, Apr 30 2018, Avery Pennarun wrote:

> On Mon, Apr 30, 2018 at 5:38 PM, Stefan Beller  wrote:
> There's one exception, which is doing a one-time permanent merge of
> two projects into one.  That's a nice feature, but is probably used
> extremely rarely.

FWIW this is the only thing I've used it for. I do this occasionally and
used to do this manually with format-patch + "perl -pe" before or
similar when I needed to merge some repositories together, and then some
other times I was less stupid and manually started doing something
similar to what subtree is doing with a "move everything" commit just
before the merge of the two histories.

>> https://trends.google.com/trends/explore?date=all=git%20subtree,git%20submodule
>>
>> Not sure what to make of this data.
>
> Clearly people need a lot more help when using submodules than when
> using subtree :)

Pretty clear it's garbage data, unless we're to believe that the
relative interest of submodules in the US, Germany and Sweden is 51, 64
& 84, but 75, 100 and 0 for subtree.

Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Stefan Beller

On Mon, Apr 30, 2018 at 2:53 PM, Avery Pennarun  wrote:

> For the best of both worlds, I've often thought that a good balance
> would be to use the same data structure that submodule uses, but to
> store all the code in a single git repo under different refs, which we
> might or might not download (or might or might not have different
> ACLs) under different circumstances.

There has been some experimentation with having a simpler ref
surface on the submodule side,
https://public-inbox.org/git/cover.1512168087.git.jonathanta...@google.com/

The way you describe the future of submodules, all we'd have to do
is to teach git-clone how to select the the "interesting" refs for your use
case. Any other command would assume all submodule data to be
in the main repository.

The difference to Jonathans proposal linked above, would be the
object store to be in the main repo and the refs to be prefixed
per submodule instead of "shadowed".

>  However, when some projects get
> really huge (lots of very big submodule dependencies), then repacking
> one-big-repo starts becoming unwieldy; in that situation git-subtree
> also fails completely.

Yes, but that is a general scaling problem of Git that could be tackled,
e.g. repack into multiple packs serially instead of putting everything
into one pack.

>> Submodules do not need to produce a synthetic project history
>> when splitting off again, as the history is genuine. This allows
>> for easier work with upstream.
>
> Splitting for easier work upstream is great, and there really ought to
> be an official version of 'git subtree split', which is good for all
> sorts of purposes.
>
> However, I suspect almost all uses of the split feature are a)
> splitting a subtree that you previously merged in, or b) splitting a
> subtree into a separate project that you want to maintain separately
> from now on.  Repeated splits in case (a) are only necessary because
> you're not using submodules, or in case (b) are only necessary because
> you didn't *switch* to submodules when it finally came time to split
> the projects.  (In both cases you probably didn't switch to submodules
> because you didn't like one of its tradeoffs, especially the need to
> track multiple repos when you fork.)

That makes sense.

>
> There's one exception, which is doing a one-time permanent merge of
> two projects into one.  That's a nice feature, but is probably used
> extremely rarely.  More often people get into a
> merge-split-merge-split cycle that would be better served by a
> slightly improved git-submodule.

This rare use case is how git-subtree came into existence in gits
contrib directory AFAICT,
https://kernel.googlesource.com/pub/scm/git/git/+/634392b26275fe5436c0ea131bc89b46476aa4ae
which is interesting to view in git-show, but I think defaults could
be tweaked there, as it currently shows me mostly a license file.

>> Conceptually Gerrit is doing
>>
>>   while true:
>> git submodule update --remote
>> if worktree is dirty:
>> git commit "update the submodules"
>>
>> just that Gerrit doesn't poll but does it event based.
>
> ...and it's super handy :)  The problem is it's fundamentally
> centralized: because gerrit can serialize merges into the submodule,
> it also knows exactly how to update the link in the supermodule.  If
> there was wild branching and merging (as there often is in git) and
> you had to resolve conflicts between two submodules, I don't think it
> would be obvious at all how to do it automatically when pushing a
> submodule.  (This also works quite badly with git subtree --squash.)

With the poll based solution I don't think you'd run into many more
problems than you would with Gerrits solution.

In a nearby thread, we were just discussing the submodule merging
strategies,
https://public-inbox.org/git/1524739599.20251.17.ca...@klsmartin.com/
which might seem confusing, but the implementation is actually easy
as we just fastforward-only in submodules.

>>
>> https://trends.google.com/trends/explore?date=all=git%20subtree,git%20submodule
>>
>> Not sure what to make of this data.
>
> Clearly people need a lot more help when using submodules than when
> using subtree :)

That could be true. :)

Thanks,
Stefan

Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Avery Pennarun

On Mon, Apr 30, 2018 at 5:38 PM, Stefan Beller  wrote:
> On Mon, Apr 30, 2018 at 1:45 PM, Avery Pennarun  wrote:
> No objections from me either.
>
> Submodules seem to serve a slightly different purpose, though?

I think the purpose is actually the same - it's just the tradeoffs
that are difference.  Both sets of tradeoffs kind of suck.

> With Subtrees the superproject always contains all the code,
> even when you squash the subtree histroy when merging it in.
> In the submodule world, you may not have access to one of the
> submodules.

Right.  Personally I think it's a disadvantage of subtree that it
always contains all the code (what if some people don't want the code
for a particular build variant?).  However, it's a huge pain that
submodules *don't* contain all the code (what if I'm not online right
now, or the site supposedly containing the code goes offline, or I
want to make my own fork?).

For the best of both worlds, I've often thought that a good balance
would be to use the same data structure that submodule uses, but to
store all the code in a single git repo under different refs, which we
might or might not download (or might or might not have different
ACLs) under different circumstances.  However, when some projects get
really huge (lots of very big submodule dependencies), then repacking
one-big-repo starts becoming unwieldy; in that situation git-subtree
also fails completely.

> Submodules do not need to produce a synthetic project history
> when splitting off again, as the history is genuine. This allows
> for easier work with upstream.

Splitting for easier work upstream is great, and there really ought to
be an official version of 'git subtree split', which is good for all
sorts of purposes.

However, I suspect almost all uses of the split feature are a)
splitting a subtree that you previously merged in, or b) splitting a
subtree into a separate project that you want to maintain separately
from now on.  Repeated splits in case (a) are only necessary because
you're not using submodules, or in case (b) are only necessary because
you didn't *switch* to submodules when it finally came time to split
the projects.  (In both cases you probably didn't switch to submodules
because you didn't like one of its tradeoffs, especially the need to
track multiple repos when you fork.)

> Subtrees present you the whole history by default and the user
> needs to be explicit about not wanting to see history from the
> subtree, which is the opposite of submodules (though this
> may be planned in the future to switch).

It turns out that AFAIK, almost everyone prefers 'git subtree
--squash', which squashes into a single commit each time you merge,
much like git submodule does.  I doubt people would cry too much if
the full-history feature went away.

There's one exception, which is doing a one-time permanent merge of
two projects into one.  That's a nice feature, but is probably used
extremely rarely.  More often people get into a
merge-split-merge-split cycle that would be better served by a
slightly improved git-submodule.

>> The gerrit team (eg. Stefan Beller) has been doing some really great
>> stuff to make submodules more usable by helping with relative
>> submodule links and by auto-updating links in supermodules at the
>> right times.  Unfortunately doing that requires help from the server
>> side, which kind of messes up decentralization and so doesn't solve
>> the problem in the general case.
>
> Conceptually Gerrit is doing
>
>   while true:
> git submodule update --remote
> if worktree is dirty:
> git commit "update the submodules"
>
> just that Gerrit doesn't poll but does it event based.

...and it's super handy :)  The problem is it's fundamentally
centralized: because gerrit can serialize merges into the submodule,
it also knows exactly how to update the link in the supermodule.  If
there was wild branching and merging (as there often is in git) and
you had to resolve conflicts between two submodules, I don't think it
would be obvious at all how to do it automatically when pushing a
submodule.  (This also works quite badly with git subtree --squash.)

>> I really wish there were a good answer, but I don't know what it is.
>> I do know that lots of people seem to at least be happy using
>> git-subtree, and would be even happier if it were installed
>> automatically with git.
>
> https://trends.google.com/trends/explore?date=all=git%20subtree,git%20submodule
>
> Not sure what to make of this data.

Clearly people need a lot more help when using submodules than when
using subtree :)

Have fun,

Avery

Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Stefan Beller

On Mon, Apr 30, 2018 at 1:45 PM, Avery Pennarun  wrote:
> On Mon, Apr 30, 2018 at 5:50 AM, Ævar Arnfjörð Bjarmason
>  wrote:
>> I think at this point git-subtree is widely used enough to move out of
>> contrib/, maybe others disagree, but patches are always better for
>> discussion that patch-less ML posts.
>
> I really was hoping git-subtree would be completely obsoleted by
> git-submodule by now, but... it hasn't been, so... no objections on
> this end.

No objections from me either.

Submodules seem to serve a slightly different purpose, though?

With Subtrees the superproject always contains all the code,
even when you squash the subtree histroy when merging it in.
In the submodule world, you may not have access to one of the
submodules.

Submodules do not need to produce a synthetic project history
when splitting off again, as the history is genuine. This allows
for easier work with upstream.

Subtrees present you the whole history by default and the user
needs to be explicit about not wanting to see history from the
subtree, which is the opposite of submodules (though this
may be planned in the future to switch).

> The gerrit team (eg. Stefan Beller) has been doing some really great
> stuff to make submodules more usable by helping with relative
> submodule links and by auto-updating links in supermodules at the
> right times.  Unfortunately doing that requires help from the server
> side, which kind of messes up decentralization and so doesn't solve
> the problem in the general case.

Conceptually Gerrit is doing

  while true:
git submodule update --remote
if worktree is dirty:
git commit "update the submodules"

just that Gerrit doesn't poll but does it event based.

> I really wish there were a good answer, but I don't know what it is.
> I do know that lots of people seem to at least be happy using
> git-subtree, and would be even happier if it were installed
> automatically with git.

https://trends.google.com/trends/explore?date=all=git%20subtree,git%20submodule

Not sure what to make of this data.

Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Avery Pennarun

On Mon, Apr 30, 2018 at 5:50 AM, Ævar Arnfjörð Bjarmason
 wrote:
> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion that patch-less ML posts.

I really was hoping git-subtree would be completely obsoleted by
git-submodule by now, but... it hasn't been, so... no objections on
this end.

The gerrit team (eg. Stefan Beller) has been doing some really great
stuff to make submodules more usable by helping with relative
submodule links and by auto-updating links in supermodules at the
right times.  Unfortunately doing that requires help from the server
side, which kind of messes up decentralization and so doesn't solve
the problem in the general case.

I really wish there were a good answer, but I don't know what it is.
I do know that lots of people seem to at least be happy using
git-subtree, and would be even happier if it were installed
automatically with git.

Have fun,

Avery

Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Philip Oakley


From: "Ævar Arnfjörð Bjarmason" 

I think at this point git-subtree is widely used enough to move out of
contrib/, maybe others disagree, but patches are always better for
discussion that patch-less ML posts.



Assuming this lands in Git, then there will also need to be a simple follow 
on into Duy's series that is updating the command-list.txt (Message-Id: 
<20180429181844.21325-10-pclo...@gmail.com>). Duy's series also does the 
completions thing IIUC;-).

--
Philip


Ævar Arnfjörð Bjarmason (4):
 git-subtree: move from contrib/subtree/
 subtree: remove support for git version <1.7
 subtree: fix a test failure under GETTEXT_POISON
 i18n: translate the git-subtree command

.gitignore|   1 +
Documentation/git-submodule.txt   |   2 +-
.../subtree => Documentation}/git-subtree.txt |   3 +
Makefile  |   1 +
contrib/subtree/.gitignore|   7 -
contrib/subtree/COPYING   | 339 --
contrib/subtree/INSTALL   |  28 --
contrib/subtree/Makefile  |  97 -
contrib/subtree/README|   8 -
contrib/subtree/t/Makefile|  86 -
contrib/subtree/todo  |  48 ---
.../subtree/git-subtree.sh => git-subtree.sh  | 109 +++---
{contrib/subtree/t => t}/t7900-subtree.sh |  21 +-
13 files changed, 78 insertions(+), 672 deletions(-)
rename {contrib/subtree => Documentation}/git-subtree.txt (99%)
delete mode 100644 contrib/subtree/.gitignore
delete mode 100644 contrib/subtree/COPYING
delete mode 100644 contrib/subtree/INSTALL
delete mode 100644 contrib/subtree/Makefile
delete mode 100644 contrib/subtree/README
delete mode 100644 contrib/subtree/t/Makefile
delete mode 100644 contrib/subtree/todo
rename contrib/subtree/git-subtree.sh => git-subtree.sh (84%)
rename {contrib/subtree/t => t}/t7900-subtree.sh (99%)

--
2.17.0.290.gded63e768a

Re: [PATCH 0/4] doc: cleaning up instances of \--

2018-04-18 Thread brian m. carlson

On Tue, Apr 17, 2018 at 09:15:25PM +0200, Martin Ågren wrote:
> This is a patch series to convert \-- to -- in our documentation. The
> first patch is a reiteration of 1c262bb7b2 (doc: convert \--option to
> --option, 2015-05-13) to fix some instances that have appeared since.
> The other three patches deal with standalone "\--" which we can't
> always turn into "--" since it can be rendered as an em dash.

This series looked sane to me as well.  I appreciate the work to keep
our documentation working in both AsciiDoc and Asciidoctor.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: [PATCH 0/4] doc: cleaning up instances of \--

2018-04-17 Thread Junio C Hamano

Martin Ågren  writes:

> This is a patch series to convert \-- to -- in our documentation. The
> first patch is a reiteration of 1c262bb7b2 (doc: convert \--option to
> --option, 2015-05-13) to fix some instances that have appeared since.
> The other three patches deal with standalone "\--" which we can't
> always turn into "--" since it can be rendered as an em dash.

All looked sensible.  As you mentioned in [2/4], "\--::" that is
part of an enumulation appear in documentation for about a dozen
commands after the series, but I do not think we can avoid it.

One thing that makes me wonder related to these patches is if a
newer AsciiDoc we assume lets us do without {litdd} macro.  This
series and our earlier effort like 1c262bb7 ("doc: convert \--option
to --option", 2015-05-13) mentions that "\--word" is less pleasant
on the eyes than "--word", but the ugliness "two{litdd}words" has
over "two--words" is far worse than that, so...

Thanks, will queue.

Re: [PATCH 0/4] Convert some stash functionality to a builtin

2018-03-25 Thread Thomas Gummerer

On 03/24, Joel Teichroeb wrote:
> I've been working on converting all of git stash to be a
> builtin, however it's hard to get it all working at once with
> limited time, so I've moved around half of it to a new
> stash--helper builtin and called these functions from the shell
> script. Once this is stabalized, it should be easier to convert
> the rest of the commands one at a time without breaking
> anything.
> 
> I've sent most of this code before, but that was targetting a
> full replacement of stash. The code is overall the same, but
> with some code review changes and updates for internal api
> changes.

Thanks for splitting this up into multiple patches, I found that much
more pleasant to review, and thanks for your continued work on this :)

> Since there seems to be interest from GSOC students who want to
> work on converting builtins, I figured I should finish what I
> have that works now so they could build on top of it.
> 
> Joel Teichroeb (4):
>   stash: convert apply to builtin
>   stash: convert branch to builtin
>   stash: convert drop and clear to builtin
>   stash: convert pop to builtin
> 
>  .gitignore  |   1 +
>  Makefile|   1 +
>  builtin.h   |   1 +
>  builtin/stash--helper.c | 514 
> 
>  git-stash.sh|  13 +-
>  git.c   |   1 +
>  6 files changed, 526 insertions(+), 5 deletions(-)
>  create mode 100644 builtin/stash--helper.c
> 
> -- 
> 2.16.2
>

Re: [PATCH 0/4] Speed up git tag --contains

2018-03-12 Thread Jeff King

On Mon, Mar 12, 2018 at 09:45:27AM -0400, Derrick Stolee wrote:

> > diff --git a/builtin/branch.c b/builtin/branch.c
> > index 8dcc2ed058..4d674e86d5 100644
> > --- a/builtin/branch.c
> > +++ b/builtin/branch.c
> > @@ -404,6 +404,7 @@ static void print_ref_list(struct ref_filter *filter, 
> > struct ref_sorting *sortin
> > memset(, 0, sizeof(array));
> > +   filter->with_commit_tag_algo = 1;
> > filter_refs(, filter, filter->kind | FILTER_REFS_INCLUDE_BROKEN);
> > if (filter->verbose)
> > 
> > drops my run of "git branch -a --contains HEAD~100" from 8.6s to
> > 0.4s on a repo with ~1800 branches. That sounds good, but on a repo with
> > a smaller number of branches, we may actually end up slower (because we
> > dig further down in history, and don't benefit from the multiple-branch
> > speedup).
> 
> It's good to know that we already have an algorithm for the multi-head
> approach. Things like `git branch -vv` are harder to tease out because the
> graph walk is called by the line-format code.

Yeah, the ahead/behind stuff will need some work. Part of it is just
code structuring. We know ahead of time which branches (and their
upstreams) are going to need this ahead/behind computation, so we should
be able to do collect them all for a single call.

But I'm not sure if a general multi-pair ahead/behind is going to be
easy. I don't have even experimental code for that. :)

We have a multi-pair ahead/behind command which we use at GitHub, but it
does each pair separately. It leans heavily on reachability bitmaps, so
the main advantage is that it's able to amortize the cost of loading the
bitmaps (both off disk, but also we sometimes have to walk to complete
the bitmaps).

-Peff

Re: [PATCH 0/4] Speed up git tag --contains

2018-03-12 Thread Derrick Stolee


On 3/3/2018 12:15 AM, Jeff King wrote:

On Fri, Jan 12, 2018 at 10:56:00AM -0800, csilvers wrote:


This is a resubmission of Jeff King's patch series to speed up git tag
--contains with some changes. It's been cooking for a while as:

Replying to this 6-year-old thread:

Is there any chance this could be resurrected?  We are using
phabricator, which uses `git branch --contains` as part of its
workflow.  Our repo has ~1000 branches on it, and the contains
operation is eating up all our CPU (and time).  It would be very
helpful to us to make this faster!

(The original thread is at
https://public-inbox.org/git/e1ou82h-0001xy...@closure.thunk.org/

Sorry, this got thrown on my "to respond" pile and languished.


Thanks for adding me to the thread. It's good to know the pain point 
people are having around commit graph walks.



There are actually three things that make "git branch --contains" slow.

First, if you're filtering 1000 branches, we'll run 1000 merge-base
traversals, which may walk over the same commits multiple times.

These days "tag --contains" uses a different algorithm that can look at
all heads in a single traversal. But the downside is that it's
depth-first, so it tends to walk down to the roots. That's generally OK
for tags, since you often have ancient tags that mean getting close to
the roots anyway.

But for branches, they're more likely to be recent, and you can get away
without going very deep into the history.

So it's a tradeoff. There's no run-time switch to flip between them, but
a patch like this:

diff --git a/builtin/branch.c b/builtin/branch.c
index 8dcc2ed058..4d674e86d5 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -404,6 +404,7 @@ static void print_ref_list(struct ref_filter *filter, 
struct ref_sorting *sortin
  
  	memset(, 0, sizeof(array));
  
+	filter->with_commit_tag_algo = 1;

filter_refs(, filter, filter->kind | FILTER_REFS_INCLUDE_BROKEN);
  
  	if (filter->verbose)


drops my run of "git branch -a --contains HEAD~100" from 8.6s to
0.4s on a repo with ~1800 branches. That sounds good, but on a repo with
a smaller number of branches, we may actually end up slower (because we
dig further down in history, and don't benefit from the multiple-branch
speedup).


It's good to know that we already have an algorithm for the multi-head 
approach. Things like `git branch -vv` are harder to tease out because 
the graph walk is called by the line-format code.



I tried to do a "best of both" algorithm in:

  https://public-inbox.org/git/20140625233429.ga20...@sigill.intra.peff.net/

which finds arbitrary numbers of merge bases in a single traversal.  It
did seem to work, but I felt uneasy about some of the corner cases.
I've been meaning to revisit it, but obviously have never gotten around
to it.

The second slow thing is that during the traversal we load each commit
object from disk. The solution there is to keep the parent information
in a faster cache. I had a few proposals over the years, but I won't
even bother to dig them up, because there's quite recent and promising
work in this area from Derrick Stolee:

   
https://public-inbox.org/git/1519698787-190494-1-git-send-email-dsto...@microsoft.com/

And finally, the thing that the patches you linked are referencing is
about using commit timestamps as a proxy for generation numbers. And
Stolee's patches actually leave room for real, trustable generation
numbers.

Once we have the serialized commit graph and generation numbers, think
the final step would just be to teach the "tag --contains" algorithm to
stop walking down unproductive lines of history. And in fact, I think we
can forget about the best-of-both multi-tip merge-base idea entirely.
Because if you can use the generation numbers to avoid going too deep,
then a depth-first approach is fine. And we'd just want to flip
git-branch over to using that algorithm by default.


I'll keep this in mind as a target for performance measurements in the 
serialized commit graph patch and the following generation number patch.


Thanks,
-Stolee

Re: [PATCH 0/4] Speed up git tag --contains

2018-03-08 Thread csilvers

} I had a few proposals over the years, but I won't even bother to dig
} them up, because there's quite recent and promising work in this
} area from Derrick Stolee:

It sounds like the best thing to do is to wait for this, then.

We managed to convert a bunch of our branches to tags, so our
immediate problem has been resolved.  But I'm sure it will come up
again as more branches are created...

carig

Re: [PATCH 0/4] Teach `--default` to `git-config(1)`

2018-03-05 Thread Taylor Blau

On Mon, Mar 05, 2018 at 06:17:25PM -0800, Taylor Blau wrote:
> Attached is a patch series to introduce `--default` and `--color` to the
> `git-config(1)` builtin with the aim of introducing a consistent interface to
> replace `--get-color`.

This series draws significant inspiration from Soukaina Nait Hmid's work
in:

  
https://public-inbox.org/git/0102015fb0bf2f74-cb456171-fe65-4d83-8784-b553c7c9e584-000...@eu-west-1.amazonses.com/

--
- Taylor

Re: [PATCH 0/4] Speed up git tag --contains

2018-03-02 Thread Jeff King

On Fri, Jan 12, 2018 at 10:56:00AM -0800, csilvers wrote:

> > This is a resubmission of Jeff King's patch series to speed up git tag
> > --contains with some changes. It's been cooking for a while as:
> 
> Replying to this 6-year-old thread:
> 
> Is there any chance this could be resurrected?  We are using
> phabricator, which uses `git branch --contains` as part of its
> workflow.  Our repo has ~1000 branches on it, and the contains
> operation is eating up all our CPU (and time).  It would be very
> helpful to us to make this faster!
> 
> (The original thread is at
> https://public-inbox.org/git/e1ou82h-0001xy...@closure.thunk.org/

Sorry, this got thrown on my "to respond" pile and languished.

There are actually three things that make "git branch --contains" slow.

First, if you're filtering 1000 branches, we'll run 1000 merge-base
traversals, which may walk over the same commits multiple times.

These days "tag --contains" uses a different algorithm that can look at
all heads in a single traversal. But the downside is that it's
depth-first, so it tends to walk down to the roots. That's generally OK
for tags, since you often have ancient tags that mean getting close to
the roots anyway.

But for branches, they're more likely to be recent, and you can get away
without going very deep into the history.

So it's a tradeoff. There's no run-time switch to flip between them, but
a patch like this:

diff --git a/builtin/branch.c b/builtin/branch.c
index 8dcc2ed058..4d674e86d5 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -404,6 +404,7 @@ static void print_ref_list(struct ref_filter *filter, 
struct ref_sorting *sortin

memset(, 0, sizeof(array));

+   filter->with_commit_tag_algo = 1;
filter_refs(, filter, filter->kind | FILTER_REFS_INCLUDE_BROKEN);

if (filter->verbose)

drops my run of "git branch -a --contains HEAD~100" from 8.6s to
0.4s on a repo with ~1800 branches. That sounds good, but on a repo with
a smaller number of branches, we may actually end up slower (because we
dig further down in history, and don't benefit from the multiple-branch
speedup).

I tried to do a "best of both" algorithm in:

 https://public-inbox.org/git/20140625233429.ga20...@sigill.intra.peff.net/

which finds arbitrary numbers of merge bases in a single traversal.  It
did seem to work, but I felt uneasy about some of the corner cases.
I've been meaning to revisit it, but obviously have never gotten around
to it.

The second slow thing is that during the traversal we load each commit
object from disk. The solution there is to keep the parent information
in a faster cache. I had a few proposals over the years, but I won't
even bother to dig them up, because there's quite recent and promising
work in this area from Derrick Stolee:

https://public-inbox.org/git/1519698787-190494-1-git-send-email-dsto...@microsoft.com/

And finally, the thing that the patches you linked are referencing is
about using commit timestamps as a proxy for generation numbers. And
Stolee's patches actually leave room for real, trustable generation
numbers.

Once we have the serialized commit graph and generation numbers, think
the final step would just be to teach the "tag --contains" algorithm to
stop walking down unproductive lines of history. And in fact, I think we
can forget about the best-of-both multi-tip merge-base idea entirely.
Because if you can use the generation numbers to avoid going too deep,
then a depth-first approach is fine. And we'd just want to flip
git-branch over to using that algorithm by default.

-Peff

Re: [PATCH 0/4] Delete ignore_env member in struct repository

2018-02-26 Thread Duy Nguyen

On Tue, Feb 27, 2018 at 3:46 AM, Stefan Beller  wrote:
> What is the plan from here on?
> Should I build further series on top of yours? The next series will
> focus on the pack side of things (pack.h, packfile.{c,h})
>
> So maybe we'll have Junio merge down my series (and yours as it
> needs one reroll?) and then build on top of that?

I think that's the less painful way for everybody.
-- 
Duy

Re: [PATCH 0/4] Delete ignore_env member in struct repository

2018-02-26 Thread Stefan Beller

On Mon, Feb 26, 2018 at 2:30 AM, Nguyễn Thái Ngọc Duy  wrote:
> It turns out I don't need my other series [1] in order to delete this
> field. This series moves getenv() calls from
> repo_set_gitdir()/repo_setup_env() and prepare_alt_odb() back in
> environment.c where they belong in my opinion.
>
> The repo_set_gitdir() now takes $GIT_DIR and optionally all other
> configurable paths. If those paths are NULL, default repo layout will
> be used. With getenv() no longer called inside repo_set_gitdir(),
> ignore_env has no reason to stay. This is in 1/4.
>
> The getenv() in prepare_alt_odb() is also moved back to
> setup_git_env() in 3/4. It demonstrates how we could move other
> getenv() back to if we want.
>
> This series is built on top of Stefan's object-store-part1, v4. I
> could rebase it on 'master' too, but then Junio may need to resolve
> some conflicts.
>

Thanks for working on this,
I found this series a pleasant read, the only issue I saw was Erics upbringing
of multiple getenv calls without strdup()ing the content.

What is the plan from here on?
Should I build further series on top of yours? The next series will
focus on the pack side of things (pack.h, packfile.{c,h})

So maybe we'll have Junio merge down my series (and yours as it
needs one reroll?) and then build on top of that?

Thanks,
Stefan

Re: [PATCH 0/4] Delete ignore_env member in struct repository

2018-02-26 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> This series is built on top of Stefan's object-store-part1, v4. I
> could rebase it on 'master' too, but then Junio may need to resolve
> some conflicts.

As a follow-up fix for combined "move things to the_repo as much as
possible" efforts I think what you did makes most sense.  I would
say that it makes even more sense to make these part of the
object-store series if/when the series needs rerolling--after all
the topic is already a multi-author effort.

I'll have to read them carefully before commenting on the actual
patches ;-)  Thanks.

>
> [1] https://public-inbox.org/git/20180225111840.16421-1-pclo...@gmail.com/
>
> Nguyễn Thái Ngọc Duy (4):
>   repository.c: move env-related setup code back to environment.c
>   repository.c: delete dead functions
>   sha1_file.c: move delayed getenv(altdb) back to setup_git_env()
>   repository: delete ignore_env member
>
>  cache.h|  2 +-
>  environment.c  | 13 +++--
>  object-store.h |  5 +++-
>  object.c   |  1 +
>  repository.c   | 79 ++
>  repository.h   | 21 +++---
>  setup.c|  3 +-
>  sha1_file.c|  6 +---
>  8 files changed, 64 insertions(+), 66 deletions(-)

Re: [PATCH 0/4] Correct offsets of hunks when one is skipped

2018-02-19 Thread Phillip Wood

On 13/02/18 23:56, brian m. carlson wrote:
> On Tue, Feb 13, 2018 at 10:44:04AM +, Phillip Wood wrote:
>> From: Phillip Wood 
>>
>> While working on a patch series to stage selected lines from a hunk
>> without having to edit it I got worried that subsequent patches would
>> be applied in the wrong place which lead to this series to correct the
>> offsets of hunks following those that are skipped or edited.
>>
>> Phillip Wood (4):
>>   add -i: add function to format hunk header
>>   t3701: add failing test for pathological context lines
>>   add -p: Adjust offsets of subsequent hunks when one is skipped
>>   add -p: calculate offset delta for edited patches
>>
>>  git-add--interactive.perl  | 93 
>> +++---
>>  t/t3701-add-interactive.sh | 30 +++
>>  2 files changed, 102 insertions(+), 21 deletions(-)
> 
> This looks reasonably sane to me.  I really like that you managed to
> produce failing tests for this situation.  I know pathological cases
> like this have bit GCC in the past, so it's good that you fixed this.
> 

Thanks Brain, it's interesting to hear that GCC has been bitten in the past

Best Wishes

Phillip

Re: [PATCH 0/4] Correct offsets of hunks when one is skipped

2018-02-13 Thread brian m. carlson

On Tue, Feb 13, 2018 at 10:44:04AM +, Phillip Wood wrote:
> From: Phillip Wood 
> 
> While working on a patch series to stage selected lines from a hunk
> without having to edit it I got worried that subsequent patches would
> be applied in the wrong place which lead to this series to correct the
> offsets of hunks following those that are skipped or edited.
> 
> Phillip Wood (4):
>   add -i: add function to format hunk header
>   t3701: add failing test for pathological context lines
>   add -p: Adjust offsets of subsequent hunks when one is skipped
>   add -p: calculate offset delta for edited patches
> 
>  git-add--interactive.perl  | 93 
> +++---
>  t/t3701-add-interactive.sh | 30 +++
>  2 files changed, 102 insertions(+), 21 deletions(-)

This looks reasonably sane to me.  I really like that you managed to
produce failing tests for this situation.  I know pathological cases
like this have bit GCC in the past, so it's good that you fixed this.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: [PATCH 0/4] Speed up git tag --contains

2018-01-12 Thread csilvers

> This is a resubmission of Jeff King's patch series to speed up git tag
> --contains with some changes. It's been cooking for a while as:

Replying to this 6-year-old thread:

Is there any chance this could be resurrected?  We are using
phabricator, which uses `git branch --contains` as part of its
workflow.  Our repo has ~1000 branches on it, and the contains
operation is eating up all our CPU (and time).  It would be very
helpful to us to make this faster!

(The original thread is at
https://public-inbox.org/git/e1ou82h-0001xy...@closure.thunk.org/
)

craig

Re: [PATCH 0/4] "Fast-"track Git GUI changes

2018-01-10 Thread Johannes Schindelin

Hi,

On Tue, 9 Jan 2018, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > Johannes Schindelin  writes:
> >
> >> As it seems to be impossible to get the attention of the Git GUI
> >> maintainer these "days" (I have opened Pull Requests on October 16th
> >> 2016 that have not even been looked at), let's just side-step that
> >> contribution path which seems to be dormant.
> >
> > Good to see that finally somebody else steps up after I did the same
> > for a few times recently.
> >
> >> These fixes have been in Git for Windows for various amounts of time.
> >>
> >> Note: there are more Git GUI changes in Git for Windows, I only
> >> accumulated the ones I deem wort considering for inclusion into v2.16.0,
> >> still.
> >
> > Thanks.  I am not sure if it is too late for 2.16, as these are not
> > fixes for regression during this cycle, though.
> 
> Heh, I changed my mind.  
> 
> Just like I pretended to be interim maintainer of Git-GUI for the
> past two times, you are doing the same this time and I even agreed
> that it was a good thing that you volunteered to pretend as one.
> 
> So let's follow through the pretence to its conclusion and merge
> these directly to 'master'.

Thank you, fellow pretender.

Ciao,
Dscho

Re: [PATCH 0/4] "Fast-"track Git GUI changes

2018-01-09 Thread Junio C Hamano

Junio C Hamano  writes:

> Johannes Schindelin  writes:
>
>> As it seems to be impossible to get the attention of the Git GUI
>> maintainer these "days" (I have opened Pull Requests on October 16th
>> 2016 that have not even been looked at), let's just side-step that
>> contribution path which seems to be dormant.
>
> Good to see that finally somebody else steps up after I did the same
> for a few times recently.
>
>> These fixes have been in Git for Windows for various amounts of time.
>>
>> Note: there are more Git GUI changes in Git for Windows, I only
>> accumulated the ones I deem wort considering for inclusion into v2.16.0,
>> still.
>
> Thanks.  I am not sure if it is too late for 2.16, as these are not
> fixes for regression during this cycle, though.

Heh, I changed my mind.  

Just like I pretended to be interim maintainer of Git-GUI for the
past two times, you are doing the same this time and I even agreed
that it was a good thing that you volunteered to pretend as one.

So let's follow through the pretence to its conclusion and merge
these directly to 'master'.

Thanks.

Re: [PATCH 0/4] "Fast-"track Git GUI changes

2018-01-09 Thread Junio C Hamano

Johannes Schindelin  writes:

> As it seems to be impossible to get the attention of the Git GUI
> maintainer these "days" (I have opened Pull Requests on October 16th
> 2016 that have not even been looked at), let's just side-step that
> contribution path which seems to be dormant.

Good to see that finally somebody else steps up after I did the same
for a few times recently.

> These fixes have been in Git for Windows for various amounts of time.
>
> Note: there are more Git GUI changes in Git for Windows, I only
> accumulated the ones I deem wort considering for inclusion into v2.16.0,
> still.

Thanks.  I am not sure if it is too late for 2.16, as these are not
fixes for regression during this cycle, though.

Re: [PATCH 0/4] Add --no-ahead-behind to status

2017-12-20 Thread Jeff King

On Wed, Dec 20, 2017 at 02:42:41PM +, Jeff Hostetler wrote:

> From: Jeff Hostetler 
> 
> This patch series adds a "--no-ahead-behind" option to status
> to request that it avoid a possibly expensive ahead/behind
> computation for the current branch.  Instead, it just prints a
> not up to date message in place of the detailed counts.
> 
> This idea was previously discussed in [1].  Working with the
> enormous Windows repository, we found that 20+ seconds was being
> spent in the ahead/behind computation when the current branch was
> 150K commits behind the upstream branch.  (Yes, this happens and
> only took 3 weeks on the reporter's system.)

Overall this looks cleanly done. I raised a few minor points in the
individual patches, but certainly nothing that would be a show-stopper
for the general idea.

> I've only modified "git status" in this patch series.  A similar
> change could be added to "git branch -vv" and "git checkout" to
> avoid delays there too.  I avoided doing it here to keep this
> patch series focused.

I have a feeling that the same user who got annoyed by "git status" is
going to get annoyed by "git checkout" sooner or later. I'm OK with
handling the other commands separately, but I suspect we may want at
least "git checkout" support in the near future.

There is one thing that it's worth thinking about (because it will be
hard to take back later): config option naming. If your repository is so
large that ahead/behind checks are annoying, would you want to be able
to set a single config to tell Git that, rather than one for each
command? If so, then do we want to ditch "status.aheadbehind" in favor
of a more unified name?

-Peff

Re: [PATCH 0/4] Fix issues with rename detection limits

2017-11-10 Thread Elijah Newren

On Fri, Nov 10, 2017 at 10:13 AM, Elijah Newren  wrote:
> In a repo with around 60k files with deep directory hierarchies (due to

> Elijah Newren (4):
>   sequencer: Warn when internal merge may be suboptimal due to
> renameLimit
>   Remove silent clamp of renameLimit
>   progress: Fix progress meters when dealing with lots of work
>   sequencer: Show rename progress during cherry picks

Sorry for the duplicate send.

I noticed the cover letter didn't appear on in the email list,
double-checked https://public-inbox.org/git/, waited over half an hour
and double checked both email and public-inbox again and still didn't
see it, so I re-sent.  Just as soon as I did, it seems that the
original and the new cover letter emails suddenly showed up right
then.  *shrug*.

Re: [PATCH 0/4] WIP: git-remote-media wiki namespace support

2017-10-30 Thread Antoine Beaupré

On 2017-10-30 11:40:06, Matthieu Moy wrote:
> Antoine Beaupré  writes:
>
>> Obviously, doing unit tests against a full MediaWiki instance isn't
>> exactly trivial.
>
> Not trivial, but doable: there is all the infrastructure to do so in t/:
> install-wiki.sh to automatically install Mediawiki, and then a testsuite
> that interacts with it.
>
> This has been written under the assumption that the developer had a
> lighttpd instance running on localhost, but this can probably be adapted
> to run on Travis-CI (install lighttpd & Mediawiki in the install: part,
> and run the tests afterwards), so that anyone can run the tests by just
> submitting a pull-request to Git-Mediawiki.
>
> If you are to work more on Git-Mediawiki, don't underestimate the
> usefullness of the testsuite (for example, Git-Mediawiki was developped
> against a prehistoric version of Mediawiki, the testsuite can help
> ensuring it still works on the lastest version), nor the fun of playing
> with install scripts and CI systems ;-).

Hello!

Glad to hear from you. :)

So I actually tried install-wiki.sh, and it "failed to start lighttpd"
and told me to see logs. I couldn't find them and stopped there...

It would be great to hook this up into CI somewhere, but I suspect it
isn't considering how it doesn't actually work out of the box.

I'm hoping we can still do things and fix some things without going
through that trouble, but I recognize it would be better to have unit
tests operational.

Honestly, I would prefer just having this thing work and not have to
work on it. :) I have lots of things on my plate and I'm just scratching
an itch on this one - some backup script broke and I am trying to fix
it. Once it works, my work is done, so unfortunately I cannot lead that
project (but I'd be happy to help when I can of course).

A.

-- 
The greatest tragedy in mankind's entire history may be the hijacking of
morality by religion.
- Arthur C. Clarke

Re: [PATCH 0/4] WIP: git-remote-media wiki namespace support

2017-10-30 Thread Matthieu Moy

Antoine Beaupré  writes:

> Obviously, doing unit tests against a full MediaWiki instance isn't
> exactly trivial.

Not trivial, but doable: there is all the infrastructure to do so in t/:
install-wiki.sh to automatically install Mediawiki, and then a testsuite
that interacts with it.

This has been written under the assumption that the developer had a
lighttpd instance running on localhost, but this can probably be adapted
to run on Travis-CI (install lighttpd & Mediawiki in the install: part,
and run the tests afterwards), so that anyone can run the tests by just
submitting a pull-request to Git-Mediawiki.

If you are to work more on Git-Mediawiki, don't underestimate the
usefullness of the testsuite (for example, Git-Mediawiki was developped
against a prehistoric version of Mediawiki, the testsuite can help
ensuring it still works on the lastest version), nor the fun of playing
with install scripts and CI systems ;-).

Cheers,

-- 
Matthieu Moy
https://matthieu-moy.fr/

Re: [PATCH 0/4] (x)diff cleanup: remove duplicate code

2017-10-24 Thread Stefan Beller

On Tue, Oct 24, 2017 at 11:59 AM, Stefan Beller  wrote:
> Junio wrote:
>
>>  * I was hoping that the next_byte() and string_hash() thing, once
>>they are cleaned up, will eventually be shared with the xdiff/
>>code at the lower layer, which needs to do pretty much the same
>>in order to implement various whitespace ignoring options.  I am
>>not sure how well the approach taken by the WIP patch meshes with
>>the needs of the lower layer.
>
> This series does exactly this; although I chose to reuse the code in
> xdiff/xutils.c instead of the new fancy next_byte/string_hash, as that
> code has seen more exercise already (hence I assume it has fewer bugs
> if any as well as its performance implications are well understood).
>
> However to do so, we need to pollute xdiff/xutils.c and include
> hashmap.h there (which also requires cache.h as that header has
> an inline function using BUG()), which I find a bit unfortunate but
> worth the trade off of using better tested code.
>

This, of course, applies on top of jk/diff-color-moved-fix.

Re: [PATCH 0/4] fsmonitor fixes

2017-10-20 Thread Johannes Schindelin

Hi Alex,

On Thu, 19 Oct 2017, Alex Vandiver wrote:

> A few fixes found from playing around with the fsmonitor branch in
> next.

Thank you for having a look, and even better: for sending a couple of
improvements on top of Ben's & my work.

I sent a couple of suggestions and hope you will find the useful?

Ciao,
Johannes

Re: [PATCH 0/4] pre-merge hook

2017-10-16 Thread Junio C Hamano

Michael J Gruber  writes:

> This series revives an old suggestion of mine to make merge honor
> pre-commit hook or a separate pre-merge hook

This seems to have become an abandoned loose end, so I'll drop the
topic from my tree for now; revival of the discussion is _not_
unwelcome (aka "dropping without prejudice").

Re: [PATCH 0/4] pre-merge hook

2017-10-02 Thread Stefan Beller

On Sun, Oct 1, 2017 at 10:54 PM, Junio C Hamano  wrote:
> Michael J Gruber  writes:
>
>> Now that f8b863598c ("builtin/merge: honor commit-msg hook for merges",
>> 2017-09-07) has landed, merge is getting closer to behaving like commit,
>> which is important because both are used to complete merges (automatic
>> vs. non-automatic).
>
> Just a gentle ping to the thread to see if it is still alive.
>
> I think people agree that this is a good thing to do, but it seems
> that the execution leaves a few loose ends, judging from the review
> comments that have yet to be answered.
>
> Thanks.

I agree. IIRC the series was ok in design and goal, just minor comments
for coding.

Thanks,
Stefan

Re: [PATCH 0/4] pre-merge hook

2017-10-01 Thread Junio C Hamano

Michael J Gruber  writes:

> Now that f8b863598c ("builtin/merge: honor commit-msg hook for merges",
> 2017-09-07) has landed, merge is getting closer to behaving like commit,
> which is important because both are used to complete merges (automatic
> vs. non-automatic).

Just a gentle ping to the thread to see if it is still alive.

I think people agree that this is a good thing to do, but it seems
that the execution leaves a few loose ends, judging from the review
comments that have yet to be answered.

Thanks.

Re: [PATCH 0/4] Fixes from the per-repository object store series

2017-09-14 Thread Junio C Hamano

Jonathan Nieder  writes:

> This is a continuation of the series at [1].  That was part 1 ---
> you can think of this as part 0, since it contains the simplest and
> least controversial part of the series --- each patch in this series
> is a bugfix in its own right.
>
> Patch 1 should be familiar if you reviewed the series [1].  It is
> unchanged from the patch there, except to note Peff's ack.
>
> Patches 2-4 have not yet visited the list but are fixes (or, in the
> case of patch 4, cleanups) noticed as part of the same process of
> teaching object store code to handle multiple repositories.
>
> We hope that splitting these out should make them easier to review
> and for people to benefit from these fixes without having to wait
> for the rest of the series to settle.

One thing that is unclear is if you are tentatively withdrawing the
longer series with this update, but I'd assume that it is the case
because these two obviously will conflict with each other, and the
most of what these four patches do are subset of what the larger one
you sent earlier do.

Thanks.

Re: [PATCH 0/4] Test name-rev with small stack

2017-09-11 Thread Jeff King

On Fri, Sep 08, 2017 at 02:33:35PM +0200, Michael J Gruber wrote:

> Looking at it more closely, the solution in cbc60b6720 ("git tag
> --contains: avoid stack overflow", 2014-04-24) seems to be a bit "ad
> hoc" to me:
> 
> First of all, there is more than "tag --contains" that may exceed the
> stack by recursion. So I would expect the solution to be more general,
> and not localised and specialised to builtin/tag.c

At the time, tag was the only one using this depth-first contains
algorithm. It's since been adapted to ref-filter.c, but of course the
stack handling went with it.

Most traversals have a date-sorted queue, so are effectively doing a
breadth-first iteration with no recursion.

> Second, this is a walk, so I'm wondering whether our revision walk
> machinery should be the place to add missing functionality (if any).
> That way, everything would benefit from possible or existing
> improvements there. For example, I think some of our "extra walkers"
> don't heed object replacements. (I need to test more.)

It's possible that name-rev could make better use of the existing
traversal machinery. It's often tough to do so, though, because that
machinery gives you a linearized ordering of the commits. Whereas
something like name-rev really cares about the order that it visits the
commits, because it's building up the names.

It's the same for this "tag --contains" traversal. It _used_ to be a
series of merge-base computations. But by doing a custom traversal, we
can cache incremental results through the graph and avoid walking over
the same bits multiple times. There actually is a way to do it with the
regular breadth-first traversal, but you have to store one bit per ref
you're checking for each commit.

I played around with that a bit a while ago, and it did seem to work. I
can dig up the patches if you're interested. But one downside is that
one bit per ref per commit adds up if you have a lot of refs. A large
number of those bitfields will be the same, so you could probably get by
with a copy-on-write scheme, but I never implemented that.

Of course somebody may have a more clever algorithm, too. I don't claim
the above is a proof. ;)

-Peff

Re: [PATCH 0/4] Test name-rev with small stack

2017-09-08 Thread Michael J Gruber

Jeff King venit, vidit, dixit 07.09.2017 16:54:
> On Thu, Sep 07, 2017 at 04:02:19PM +0200, Michael J Gruber wrote:
> 
>> name-rev segfaults for me in emacs.git with the typical 8102 stack size.
>> The reason is the recursive walk that name-rev uses.
>>
>> This series adds a test to mark this as known failure, after some
>> clean-ups.
> 
> These all look reasonable to me. The size of the test case in the final
> one is presumably arbitrary and just copied from t7004. I don't know if
> it's worth trying to shrink it. It could shorten a rather expensive
> test. OTOH, if we shorten it too much then we might get a false pass
> (e.g., if the algorithm remains recursive but has a smaller stack
> footprint).
> 
>> Michael J Gruber (4):
>>   t7004: move limited stack prereq to test-lib
>>   t6120: test name-rev --all and --stdin
>>   t6120: clean up state after breaking repo
>>   t6120: test describe and name-rev with deep repos
> 
> Now comes the hard part: rewriting the C code. :)

Looking at it more closely, the solution in cbc60b6720 ("git tag
--contains: avoid stack overflow", 2014-04-24) seems to be a bit "ad
hoc" to me:

First of all, there is more than "tag --contains" that may exceed the
stack by recursion. So I would expect the solution to be more general,
and not localised and specialised to builtin/tag.c

Second, this is a walk, so I'm wondering whether our revision walk
machinery should be the place to add missing functionality (if any).
That way, everything would benefit from possible or existing
improvements there. For example, I think some of our "extra walkers"
don't heed object replacements. (I need to test more.)

Michael

Re: [PATCH 0/4] Test name-rev with small stack

2017-09-07 Thread Jeff King

On Thu, Sep 07, 2017 at 04:02:19PM +0200, Michael J Gruber wrote:

> name-rev segfaults for me in emacs.git with the typical 8102 stack size.
> The reason is the recursive walk that name-rev uses.
> 
> This series adds a test to mark this as known failure, after some
> clean-ups.

These all look reasonable to me. The size of the test case in the final
one is presumably arbitrary and just copied from t7004. I don't know if
it's worth trying to shrink it. It could shorten a rather expensive
test. OTOH, if we shorten it too much then we might get a false pass
(e.g., if the algorithm remains recursive but has a smaller stack
footprint).

> Michael J Gruber (4):
>   t7004: move limited stack prereq to test-lib
>   t6120: test name-rev --all and --stdin
>   t6120: clean up state after breaking repo
>   t6120: test describe and name-rev with deep repos

Now comes the hard part: rewriting the C code. :)

-Peff

Re: [PATCH 0/4] vcs-svn: remove repo_tree library

2017-08-22 Thread Stefan Beller

On Tue, Aug 22, 2017 at 4:37 PM, Jonathan Nieder  wrote:
> Hi,
>
> Stefan noticed that repo_init from vcs-svn/repo_tree.h conflicts with
> repository.h[1].  Earlier brian m. carlson noticed the same thing[2].
>
> Originally repo_tree.h was used to manage an in-memory representation
> of the state of the svn tree being imported.  When that in-memory
> representation was retired, we were lazy and left some utility
> functions there.  Here is a patch series to finish cleaning up and
> remove vcs-svn/repo_tree.h completely.
>
> This is an alternative to bc/vcs-svn-cleanup from 'next'.  Those
> patches weren't cc-ed to me and I missed them --- sorry about that.  I
> can rebase on top of them if that is more convenient.

A rebased version would be easier IIUC Junios reply to
the one-off that I sent.

The patches look good.

Thanks,
Stefan

>
> Thoughts of all kinds welcome, as always.
>
> Thanks,
> Jonathan
>
> Jonathan Nieder (4):
>   vcs-svn: remove prototypes for missing functions
>   vcs-svn: remove custom mode constants
>   vcs-svn: remove repo_delete wrapper function
>   vcs-svn: move remaining repo_tree functions to fast_export.h
>
>  Makefile  |  1 -
>  vcs-svn/fast_export.c | 41 +
>  vcs-svn/fast_export.h |  3 +++
>  vcs-svn/repo_tree.c   | 48 
>  vcs-svn/repo_tree.h   | 23 ---
>  vcs-svn/svndump.c | 33 -
>  6 files changed, 56 insertions(+), 93 deletions(-)
>  delete mode 100644 vcs-svn/repo_tree.c
>  delete mode 100644 vcs-svn/repo_tree.h
>
> [1] https://public-inbox.org/git/20170822213501.5928-1-sbel...@google.com
> [2] 
> https://public-inbox.org/git/2017082122.26729-3-sand...@crustytoothpaste.net

Re: [PATCH 0/4] dropping support for older curl

2017-08-14 Thread Johannes Schindelin

Hi Paolo,

On Thu, 10 Aug 2017, Paolo Ciarrocchi wrote:

> Il 10 ago 2017 11:39 AM, "Johannes Schindelin" 
> ha scritto:
> 
> 
> 
> Footnote *1*: It is no secret that I find our patch submission less than
> inviting. Granted, *I* use it. *I* did not have problems entering the
> mailing list. But then, my mails were not swallowed silently, because my
> mail program does not send HTML by default. And prepared by the German
> school system (I learned the term "sugar coating" only when exposed to
> some US culture), I had little emotional problems with being criticized
> and not thanked for my contribution, I persisted nevertheless. The opinion
> that the Git contribution process is a lot less inviting than it could be
> is not only my view, by the way. I hear this a lot. I give you that we are
> not quite as textbook "keep out from here unless you look like us, smell
> like us, talk like us, have the same genital setup like us" as the Linux
> kernel mailing list, but we are in a different universe compared to, say,
> the Drupal community. And their universe is a lot nicer to live in.
> 
> 
> Isn't SumbitGit a possible answer to your doubts (I strongly agree with
> you) about the current development process?

No. I hate to say that SubmitGit neither integrates well with GitHub Pull
Requests (code comments on GitHub are approximately 1,523x easier to
write, read, associate with the actual code, and see the current state of,
compared to the mailing list, and SubmitGit does not even hint at
integrating with that user experience).

Also, the barrier to start using SubmitGit is rather high. If you open a
Pull Request on github.com/git/git, you get *no* indication that SubmitGit
is an option to *actually* get the code into Git. There are also concerns
about required permissions that Junio Hamano himself would not accept.

Now, let's assume that you submitted the code via SubmitGit. The
challenges of the patch submission process do not end there, yet SubmitGit
goes home and has a beer. But the hard part, the discussions on the
mailing list, the status updates in the completely separate What's cooking
mails, the missing links back to the original source code (let alone the
information in which worktree on your computer under which branch name
that topic was developed again?), the diverging mail threads, the
"rerolls" that should not forget to Cc: all reviewers of previous rounds,
all that jazz is still all very, very manual.

And even if there was an easier path from having a local branch that works
to finally getting it onto the list in the required form, your mail is an
eloquent example of one of the most preposterous hurdles along the way: we
pride ourselves with the openness we demonstrate by communicating via a
mailing list, everybody has a mail address, amirite? But of course, HTML
mails, like, about 130% of all mails on the internet (or so it feels),
including yours, are dropped. Silently. Not so open anymore, are we.

It is all so frustrating, really. I work in a team (Visual Studio Team
Services, you can think of it as kind of a Microsoft take on Git hosting
plus a lot of other tooling) where we get really, really positive feedback
regarding our user experience, in particular the frequent enhancements to
PRs. It is really powerful to open, review and merge PRs, interact with
other developers, see the state of PRs and their respective discussions,
open and resolve issues, automate workflows and build tasks [*1*]. And
then I try to convince people here on this mailing list that it really
makes a difference if you start using tools to lighten the load of
everybody, and... little changes.

At least thanks to Lars Schneider's incredible efforts we have Continuous
Testing (I know how much time he spent on this, it is another thing
deserving the label "preposterous", and I know how much time I spent on
top to add the Windows part which mostly works). If only we could push it
further to true Continuous Integration. Or at least to accepting PRs from
professionals who simply have no time to fight with our patch contribution
process and whose expertise we lose as a consequence.

Ciao,
Johannes

Footnote *1*: The funniest part about this is that I do get mails about
all of this all the time. When I am pulled in as a reviewer. When a build
failed. When a previously failing task was fixed by a new build. When
somebody responded to my comments. The difference to the mailing
list-centric approach is of course that those mails are only
notifications, and link back to the tool appropriately supporting what
I, the user, want to get done.

Re: [PATCH 0/4] dropping support for older curl

On Thu, Aug 10, 2017 at 07:09:02PM -0400, Jeff King wrote:

> The first is "should we eventually drop support for antiquated versions
> of dependencies?". And the argument in favor is the one I was making
> here: besides lowering maintenance cost, it is more honest to our users
> about what to expect[1].

As usual, I forgot all my footnotes.

[1] When I've talked about keeping expectations reasonable, I'm a lot
less interested in "oops, I built Git and this particular feature
didn't work". It's easy to write that off as "well, you have an old
version of curl, patches welcome". I'm much more concerned about
security issues. Curl is network-facing. Leaving aside security
vulnerabilities in curl itself (which hopefully distros with 10-year
support periods would fix), I wouldn't be surprised if there are
bad interactions possible due to our tangle of ifdefs.

One way to address that would be more careful auditing. But then
that goes back to the cost/benefit thing.

> One is to do it by date and what dependencies are in long-term OS
> releases, and then compare that to the benefit. Requiring curl 7.11.1
> still keeps us working back to rhel4, which was already end-of-lifed
> completely after a 12 year run. Bumping to 7.16.0 drops rhel4 and rhel5,
> the latter of which is in its final "barely supported" phase after 10
> years. But it gives us a bit more bang for our buck by making CURL_MULTI
> uconditional[2].  Requiring 7.19.4 actually doesn't drop any more rhel
> releases. So by that metric, we might as well go there.

[2] The line-count change from dropping CURL_MULTI isn't _too_ exciting.
But a lot of the tangled design of our http code revolves around
the abstractions we've introduced. I have a feeling that it will
enable further cleanups as we move forward (OTOH, a lot of the worst
parts of our design are because of _using_ curl_multi for dumb http,
which of course hardly anyone does these days. But I have a feeling
if I suggested removing that, people would really scream).

-Peff

Re: [PATCH 0/4] dropping support for older curl


On 11/08/17 01:23, Jeff King wrote:

On Fri, Aug 11, 2017 at 01:17:51AM +0200, Tom G. Christensen wrote:


OK, thanks for double-checking. I'm still puzzled why your build
succeeds and mine does not.


I know what's going on now and it's so simple.
Red Hats version of curl 7.15.5 includes a number of patches including one
that backports support for CURLPROTO_* (as part of a fix for CVE-2009-0037).
I haven't checked el6 but I would not be surprised if there where similar
things going on there.


el6 should have it already as part of 7.19.7, right?



Yes of course.


So in conclusion version based #ifdefs are misleading when used with curl as
shipped with RHEL.


Yeah, that's certainly an interesting finding. In this case your builds
are missing out on redirect protection that we _could_ be providing.



Yes and I'm looking into that right now.


If we do keep the compat ifdefs around this feature, it may be worth
converting them to "#ifdef CURLPROTO_HTTP" to more directly check the
feature.



Yes, a feature test would be better.

-tgc

Re: [PATCH 0/4] dropping support for older curl

On Fri, Aug 11, 2017 at 01:17:51AM +0200, Tom G. Christensen wrote:

> > OK, thanks for double-checking. I'm still puzzled why your build
> > succeeds and mine does not.
> 
> I know what's going on now and it's so simple.
> Red Hats version of curl 7.15.5 includes a number of patches including one
> that backports support for CURLPROTO_* (as part of a fix for CVE-2009-0037).
> I haven't checked el6 but I would not be surprised if there where similar
> things going on there.

el6 should have it already as part of 7.19.7, right?

> So in conclusion version based #ifdefs are misleading when used with curl as
> shipped with RHEL.

Yeah, that's certainly an interesting finding. In this case your builds
are missing out on redirect protection that we _could_ be providing.

If we do keep the compat ifdefs around this feature, it may be worth
converting them to "#ifdef CURLPROTO_HTTP" to more directly check the
feature.

-Peff

Re: [PATCH 0/4] dropping support for older curl


On 11/08/17 00:54, Jeff King wrote:

On Fri, Aug 11, 2017 at 12:23:42AM +0200, Tom G. Christensen wrote:



Er, sorry if I'm being dense, but how? Are you suggesting that by
removing the callsite of get_curl_allowed_protocols(), the compiler
might elide the now-dead code completely? I could certainly see it being
dropped after the compilation, but I'm surprised that it wouldn't
complain about the undeclared identifiers in the first place.


You're right, that should not be able to handle it.


Can you please double-check that you're
building against the correct version of curl, and that you are building
the HTTP parts of Git (which _are_ optional, and the test suite will
pass without them).


I use a mock buildroot and there is no other curl than the vendor supplied
7.15.5 installed:
[...]


OK, thanks for double-checking. I'm still puzzled why your build
succeeds and mine does not.



I know what's going on now and it's so simple.
Red Hats version of curl 7.15.5 includes a number of patches including 
one that backports support for CURLPROTO_* (as part of a fix for 
CVE-2009-0037).
I haven't checked el6 but I would not be surprised if there where 
similar things going on there.


So in conclusion version based #ifdefs are misleading when used with 
curl as shipped with RHEL.


-tgc

Re: [PATCH 0/4] dropping support for older curl

On Thu, Aug 10, 2017 at 03:17:06PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > On Thu, Aug 10, 2017 at 11:36:41AM +0200, Johannes Schindelin wrote:
> >
> >> Hopefully I had better luck expressing my concerns this time?
> >
> > I understand your argument much better now. I'm still not sure I agree.
> >
> > -Peff
> 
> I do not think "there are a dozen #ifdefs and I don't know whether
> they still work. I don't know whether anybody (who most likely has
> better things to do than read the Git mailing list) is still using
> those.  So let's just remove them." was why you were suggesting to
> clean up the (apparent) support of older curl in the code, though.
> 
> Isn't the reason why your series simplifies these #ifdefs away
> because we by accident started using some features that require a
> version that is even newer than any of these #ifdef's try to cater
> to and yet nobody complained?  That is a lot more similar to the
> removal of rsync transport that happened in a not so distant past,
> where the reason for removal was "We have been shipping code that
> couldn't have possibly worked for some time and nobody complained
> ---we know nobody is depending on it."

I think there are two questions to be asked, and their answers come from
different lines of reasoning.

The first is "should we eventually drop support for antiquated versions
of dependencies?". And the argument in favor is the one I was making
here: besides lowering maintenance cost, it is more honest to our users
about what to expect[1].

The second is "how far back should we keep support?".

And there are two lines of thought there.

One is to do it by date and what dependencies are in long-term OS
releases, and then compare that to the benefit. Requiring curl 7.11.1
still keeps us working back to rhel4, which was already end-of-lifed
completely after a 12 year run. Bumping to 7.16.0 drops rhel4 and rhel5,
the latter of which is in its final "barely supported" phase after 10
years. But it gives us a bit more bang for our buck by making CURL_MULTI
uconditional[2].  Requiring 7.19.4 actually doesn't drop any more rhel
releases. So by that metric, we might as well go there.

And the second line of thought is: it was already broken and nobody
reported it or offered up a fix. And that's where the "similar to rsync"
thing comes in. Though in this case we do have some evidence that people
(at least Tom) was patching and distributing behind the scenes. And our
breakage period was much shorter (since v2.12.0, but that's only months
or so).

-Peff

Re: [PATCH 0/4] dropping support for older curl

On Fri, Aug 11, 2017 at 12:23:42AM +0200, Tom G. Christensen wrote:

> > > I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems 
> > > at
> > > all neither with building nor with running the testsuite.
> > 
> > As you can see, this does not compile for me. What's going on?
> > 
> The call site for get_curl_allowed_protocols() in http.c is still protected
> by an #if:
> #if LIBCURL_VERSION_NUM >= 0x071304
> curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
>  get_curl_allowed_protocols(0));
> curl_easy_setopt(result, CURLOPT_PROTOCOLS,
>  get_curl_allowed_protocols(-1));
> #else
> warning("protocol restrictions not applied to curl redirects
> because\n"
> "your curl version is too old (>= 7.19.4)");
> #endif
> 
> > I don't see how it could work, as CURLPROTO_HTTP is not defined at all
> > in that version of curl.
> 
> Indeed but the #if will handle that.

Er, sorry if I'm being dense, but how? Are you suggesting that by
removing the callsite of get_curl_allowed_protocols(), the compiler
might elide the now-dead code completely? I could certainly see it being
dropped after the compilation, but I'm surprised that it wouldn't
complain about the undeclared identifiers in the first place.

And if that _is_ what is happening...that seems like a very fragile and
unportable thing to be depending on.

> > Can you please double-check that you're
> > building against the correct version of curl, and that you are building
> > the HTTP parts of Git (which _are_ optional, and the test suite will
> > pass without them).
> 
> I use a mock buildroot and there is no other curl than the vendor supplied
> 7.15.5 installed:
> [...]

OK, thanks for double-checking. I'm still puzzled why your build
succeeds and mine does not.

> > I saw that, too. But as I understand it, they provide no code updates:
> > no bugfixes and no security updates. They just promise to answer the
> > phone and help you with troubleshooting. It's possible my perception is
> > wrong, though; I'm certainly not one of their customers.
> 
> I am refering to the Extended Life-cycle Support product (ELS), which
> promises:
> "the ELS Add-On delivers certain critical-impact security fixes and selected
> urgent priority bug fixes and troubleshooting for the last minor release"
> 
> The full description is here:
> https://access.redhat.com/support/policy/updates/errata#Extended_Life_Cycle_Phase

That was the same page I was looking at. The bit I read was:

  For versions of products in the Extended Life Phase, Red Hat will
  provide limited ongoing technical support. No bug fixes, security
  fixes, hardware enablement or root-cause analysis will be available
  during this phase, and support will be provided on existing
  installations only.

But I missed the bit about the "ELS add-on" below there, which I guess
is an extra thing. I do suspect that "install arbitrary new versions of
Git" is outside of their scope of "urgent priority bug fixes". But in a
sense it doesn't really matter. What is much more interesting is whether
there's a significant population that is running RHEL5 and has a strong
need for newer versions of Git. That I'm not sure about.

-Peff

Re: [PATCH 0/4] dropping support for older curl


On 10/08/17 23:32, Jeff King wrote:

On Thu, Aug 10, 2017 at 10:33:18PM +0200, Tom G. Christensen wrote:


You've totally ignored the argument I made back then[1], and which I
reiterated in this thread. So I'll say it one more time: the more
compelling reason is not the #ifdefs, but the fact that the older
versions are totally untested.


Perhaps you forgot but I stated in the original thread that I build RPMS for
RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every
single time.


I didn't forget. I actually double-checked the patches you sent at the
time, but I didn't see one for the CURLPROTO issue. And indeed, it is
still broken for me:

   $ cd /path/to/curl/repo
   $ git checkout curl-7_15_5
   $ ./buildconf && ./configure --prefix=/tmp/foo && make install
   $ cd /path/to/git
   $ git checkout v2.14.0
   $ make CURLDIR=/tmp/foo V=1 http.o
   gcc -o http.o -c -MF ./.depend/http.o.d -MQ http.o -MMD -MP   -g -O0 -Wall -Werror -Wdeclaration-after-statement -Wpointer-arith -Wstrict-prototypes -Wvla 
-Wold-style-declaration -Wold-style-definition -Wno-error -Wno-cpp -Wno-unused-value -Wno-strict-prototypes  -I. -DUSE_LIBPCRE1 -DHAVE_ALLOCA_H 
-I/tmp/foo/include -DUSE_CURL_FOR_IMAP_SEND -DNO_GETTEXT -DSHA1_DC -DSHA1DC_NO_STANDARD_INCLUDES -DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 
-DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" -DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C="\"sha1dc_git.c\"" 
-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H="\"sha1dc_git.h\"" -DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\""  
-DHAVE_PATHS_H -DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME -DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM  -DFREAD_READS_DIRECTORIES -DNO_STRLCPY -DSHELL_PATH='
  "/bin/sh"' -DPAGER_ENV='"LESS=FRX LV=-c"'  http.c
   http.c: In function ‘get_curl_allowed_protocols’:
   http.c:685:24: error: ‘CURLPROTO_HTTP’ undeclared (first use in this 
function); did you mean ‘CURLPROXY_HTTP’?
  allowed_protocols |= CURLPROTO_HTTP;
   ^~
   CURLPROXY_HTTP
   [and so on]


I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at
all neither with building nor with running the testsuite.


As you can see, this does not compile for me. What's going on?

The call site for get_curl_allowed_protocols() in http.c is still 
protected by an #if:

#if LIBCURL_VERSION_NUM >= 0x071304
curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
 get_curl_allowed_protocols(0));
curl_easy_setopt(result, CURLOPT_PROTOCOLS,
 get_curl_allowed_protocols(-1));
#else
warning("protocol restrictions not applied to curl redirects 
because\n"

"your curl version is too old (>= 7.19.4)");
#endif


I don't see how it could work, as CURLPROTO_HTTP is not defined at all
in that version of curl. 


Indeed but the #if will handle that.


Can you please double-check that you're
building against the correct version of curl, and that you are building
the HTTP parts of Git (which _are_ optional, and the test suite will
pass without them).



I use a mock buildroot and there is no other curl than the vendor 
supplied 7.15.5 installed:

# pwd
/var/lib/mock/jrpms-el5-x86_64/root
# find . -name 'curlver.h'
./usr/include/curl/curlver.h
# grep LIBCURL_VERSION_NUM ./usr/include/curl/curlver.h
   parsing and comparions by programs. The LIBCURL_VERSION_NUM define will
#define LIBCURL_VERSION_NUM 0x070f05
#

[root@c5-32bit-01 ~]# rpm -q git
git-2.14.1-1.el5.jr
[root@c5-32bit-01 ~]# ldd /usr/libexec/git-core/git-http-fetch |grep libcurl
libcurl.so.3 => /usr/lib/libcurl.so.3 (0x001e7000)
[root@c5-32bit-01 ~]# rpm -qf /usr/lib/libcurl.so.3
curl-7.15.5-17.el5_9
[root@c5-32bit-01 ~]# git --version
git version 2.14.1
[root@c5-32bit-01 ~]# git clone 
https://github.com/tgc/tgcware-for-solaris.git

Cloning into 'tgcware-for-solaris'...
warning: protocol restrictions not applied to curl redirects because
your curl version is too old (>= 7.19.4)
remote: Counting objects: 2793, done.
remote: Total 2793 (delta 0), reused 0 (delta 0), pack-reused 2793
Receiving objects: 100% (2793/2793), 780.88 KiB | 639.00 KiB/s, done.
Resolving deltas: 100% (1233/1233), done.
[root@c5-32bit-01 ~]#




I also won't claim any absolutes. I think we all agree this is a
cost/benefit tradeoff. But there are a lot of options for building on a
very old system. For instance, building without http if you don't need
it. Or building a more recent libcurl (and even linking statically for
simplicity).



Of course that is always an option but it does complicate things.


I'd find arguments against the latter more compelling if recent curl
were hard to compile on old systems. I don't know whether that's the
case (certainly on a modern system, it's much easier to get newer
versions of curl to compile than older ones).



I have no experience with building curl on older Linux systems. I know 
that I can build it on old Solaris releases but that is not

Re: [PATCH 0/4] dropping support for older curl

2017-08-10 Thread Junio C Hamano

Jeff King  writes:

> On Thu, Aug 10, 2017 at 11:36:41AM +0200, Johannes Schindelin wrote:
>
>> Hopefully I had better luck expressing my concerns this time?
>
> I understand your argument much better now. I'm still not sure I agree.
>
> -Peff

I do not think "there are a dozen #ifdefs and I don't know whether
they still work. I don't know whether anybody (who most likely has
better things to do than read the Git mailing list) is still using
those.  So let's just remove them." was why you were suggesting to
clean up the (apparent) support of older curl in the code, though.

Isn't the reason why your series simplifies these #ifdefs away
because we by accident started using some features that require a
version that is even newer than any of these #ifdef's try to cater
to and yet nobody complained?  That is a lot more similar to the
removal of rsync transport that happened in a not so distant past,
where the reason for removal was "We have been shipping code that
couldn't have possibly worked for some time and nobody complained
---we know nobody is depending on it."

Or "We accidentally started shipping code with comma after the last
element of enum decl and nobody compalined---everybody's compiler
must be ready" ;-)

Re: [PATCH 0/4] dropping support for older curl

On Thu, Aug 10, 2017 at 11:36:41AM +0200, Johannes Schindelin wrote:

> Hopefully I had better luck expressing my concerns this time?

I understand your argument much better now. I'm still not sure I agree.

-Peff

Re: [PATCH 0/4] dropping support for older curl

On Thu, Aug 10, 2017 at 10:33:18PM +0200, Tom G. Christensen wrote:

> > You've totally ignored the argument I made back then[1], and which I
> > reiterated in this thread. So I'll say it one more time: the more
> > compelling reason is not the #ifdefs, but the fact that the older
> > versions are totally untested.
> 
> Perhaps you forgot but I stated in the original thread that I build RPMS for
> RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every
> single time.

I didn't forget. I actually double-checked the patches you sent at the
time, but I didn't see one for the CURLPROTO issue. And indeed, it is
still broken for me:

  $ cd /path/to/curl/repo
  $ git checkout curl-7_15_5
  $ ./buildconf && ./configure --prefix=/tmp/foo && make install
  $ cd /path/to/git
  $ git checkout v2.14.0
  $ make CURLDIR=/tmp/foo V=1 http.o
  gcc -o http.o -c -MF ./.depend/http.o.d -MQ http.o -MMD -MP   -g -O0 -Wall 
-Werror -Wdeclaration-after-statement -Wpointer-arith -Wstrict-prototypes -Wvla 
-Wold-style-declaration -Wold-style-definition -Wno-error -Wno-cpp 
-Wno-unused-value -Wno-strict-prototypes  -I. -DUSE_LIBPCRE1 -DHAVE_ALLOCA_H 
-I/tmp/foo/include -DUSE_CURL_FOR_IMAP_SEND -DNO_GETTEXT -DSHA1_DC 
-DSHA1DC_NO_STANDARD_INCLUDES -DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 
-DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" 
-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C="\"sha1dc_git.c\"" 
-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H="\"sha1dc_git.h\"" 
-DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\""  -DHAVE_PATHS_H 
-DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME -DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM  
-DFREAD_READS_DIRECTORIES -DNO_STRLCPY -DSHELL_PATH='"/bin/sh"' 
-DPAGER_ENV='"LESS=FRX LV=-c"'  http.c
  http.c: In function ‘get_curl_allowed_protocols’:
  http.c:685:24: error: ‘CURLPROTO_HTTP’ undeclared (first use in this 
function); did you mean ‘CURLPROXY_HTTP’?
 allowed_protocols |= CURLPROTO_HTTP;
  ^~
  CURLPROXY_HTTP
  [and so on]

> I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at
> all neither with building nor with running the testsuite.

As you can see, this does not compile for me. What's going on?

I don't see how it could work, as CURLPROTO_HTTP is not defined at all
in that version of curl.  Can you please double-check that you're
building against the correct version of curl, and that you are building
the HTTP parts of Git (which _are_ optional, and the test suite will
pass without them).

> > So IMHO this is about being honest with users about which versions we
> > _actually_ support.
> 
> I have no problem with you wanting to drop support for older curl releases
> (such as 7.15.5) but don't use the argument that it doesn't currently build
> and nobody cares.

My argument isn't quite that nobody cares. It's that we do users a
disservice by shipping a version of the code that very well may have
hidden problems like security holes (for instance, we do not handle
redirects safely in old versions of curl). So if you can get it to build
it may _seem_ fine, but it's a bit of a booby-trap waiting to spring.

I also won't claim any absolutes. I think we all agree this is a
cost/benefit tradeoff. But there are a lot of options for building on a
very old system. For instance, building without http if you don't need
it. Or building a more recent libcurl (and even linking statically for
simplicity).

I'd find arguments against the latter more compelling if recent curl
were hard to compile on old systems. I don't know whether that's the
case (certainly on a modern system, it's much easier to get newer
versions of curl to compile than older ones).

> Also FWIW Red Hat continues to support RHEL 5 with the Extended Life-cycle
> Support program until 2020-11-30.

I saw that, too. But as I understand it, they provide no code updates:
no bugfixes and no security updates. They just promise to answer the
phone and help you with troubleshooting. It's possible my perception is
wrong, though; I'm certainly not one of their customers.

-Peff

Re: [PATCH 0/4] dropping support for older curl


On 09/08/17 23:47, Jeff King wrote:

On Wed, Aug 09, 2017 at 11:42:12PM +0200, Johannes Schindelin wrote:

I mean, if we even go out of our way to support the completely outdated
and obsolete .git/branches/ for what is likely a single user, it may not
be the worst to keep those couple of #ifdef guards to keep at least
nominal support for older cURLs?


You've totally ignored the argument I made back then[1], and which I
reiterated in this thread. So I'll say it one more time: the more
compelling reason is not the #ifdefs, but the fact that the older
versions are totally untested. 


Perhaps you forgot but I stated in the original thread that I build RPMS 
for RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite 
every single time.

I currently have 2.13.3 up for el4, el5, el6 and el7.
Only el4 requires any patches, the rest will build out of the box with 
the vendor supplied version of curl.


The plan was to drop the el4 builds for 2.14.0 to get rid of the patches.


In fact, they do not even compile, and
yet I have not seen any patches to fix that.



I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems 
at all neither with building nor with running the testsuite.



So IMHO this is about being honest with users about which versions we
_actually_ support.



I have no problem with you wanting to drop support for older curl 
releases (such as 7.15.5) but don't use the argument that it doesn't 
currently build and nobody cares.


Also FWIW Red Hat continues to support RHEL 5 with the Extended 
Life-cycle Support program until 2020-11-30.


-tgc

Re: [PATCH 0/4] dropping support for older curl