Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-06 Thread SZEDER Gábor
On Wed, Dec 05, 2018 at 04:56:21PM -0500, Jeff King wrote:
> Could we just kill them all?
> 
> I guess it's a little tricky, because $! is going to give us the pid of
> each subshell. We actually want to kill the test sub-process. That takes
> a few contortions, but the output is nice (every sub-job immediately
> says "ABORTED ...", and the final process does not exit until the whole
> tree is done):

It's trickier than that:

  - Nowadays our test lib translates SIGINT to exit, but not TERM (or
HUP, in case a user decides to close the terminal window), which
means that if the test runs any daemons in the background, then
those won't be killed when the stress test is aborted.

This is easy enough to address, and I've been meaning to do this
in an other patch series anyway.

  - With the (implied) '--verbose-log' the process killed in the
background subshell's SIGTERM handler is not the actual test
process, because '--verbose-log' runs the test again in a piped
subshell to save its output:

  (GIT_TEST_TEE_STARTED=done ${TEST_SHELL_PATH} "$0" "$@" 2>&1;
   echo $? >"$TEST_RESULTS_BASE.exit") | tee -a "$GIT_TEST_TEE_OUTPUT_FILE"

That 'kill' kills the "outer" shell process.
And though I get "ABORTED..." immediately and I get back my
prompt right away, the commands involved in the above construct
are still running in the background:

  $ ./t3903-stash.sh --stress=1
  ^CABORTED  0.0
  $ ps a |grep t3903
  1280 pts/17   S  0:00 /bin/sh ./t3903-stash.sh --stress=1
  1281 pts/17   S  0:00 tee -a 
<...>/test-results/t3903-stash.stress-0.out
  1282 pts/17   S  0:00 /bin/sh ./t3903-stash.sh --stress=1
  4173 pts/17   S+ 0:00 grep t3903

I see this behavior with all shells I tried.
I haven't yet started to think it through why this happens :)

Not implying '--verbose-log' but redirecting the test script's
output, like you did in your 'stress' script, seems to work in
dash, ksh, and ks93, but not in Bash v4.3 or later, where, for
whatever reason, the subshell get SIGINT before the SIGTERM would
arrive.
While we could easily handle SIGINT in the subshell with the same
signal handler as SIGTERM, it bothers me that something
fundamental is wrong here.
Furthermore, then we should perhaps forbid '--stress' in
combination with '--verbose-log' or '--tee'.


> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 9b7f687396..357dead3ff 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -98,8 +98,9 @@ done,*)
>   mkdir -p "$(dirname "$TEST_RESULTS_BASE")"
>   stressfail="$TEST_RESULTS_BASE.stress-failed"
>   rm -f "$stressfail"
> - trap 'echo aborted >"$stressfail"' TERM INT HUP
> + trap 'echo aborted >"$stressfail"; kill $job_pids 2>/dev/null; wait' 
> TERM INT HUP
>  
> + job_pids=
>   job_nr=0
>   while test $job_nr -lt "$job_count"
>   do
> @@ -108,10 +109,13 @@ done,*)
>   GIT_TEST_STRESS_JOB_NR=$job_nr
>   export GIT_TEST_STRESS_STARTED GIT_TEST_STRESS_JOB_NR
>  
> + trap 'kill $test_pid 2>/dev/null' TERM
> +
>   cnt=0
>   while ! test -e "$stressfail"
>   do
> - if $TEST_SHELL_PATH "$0" "$@" >/dev/null 2>&1
> + $TEST_SHELL_PATH "$0" "$@" >/dev/null 2>&1 & 
> test_pid=$!
> + if wait
>   then
>   printf >&2 "OK   %2d.%d\n" 
> $GIT_TEST_STRESS_JOB_NR $cnt
>   elif test -f "$stressfail" &&
> @@ -124,16 +128,11 @@ done,*)
>   fi
>   cnt=$(($cnt + 1))
>   done
> - ) &
> + ) & job_pids="$job_pids $!"
>   job_nr=$(($job_nr + 1))
>   done
>  
> - job_nr=0
> - while test $job_nr -lt "$job_count"
> - do
> - wait
> - job_nr=$(($job_nr + 1))
> - done
> + wait
>  
>   if test -f "$stressfail" && test "$(cat "$stressfail")" != "aborted"
>   then
> 



Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-06 Thread SZEDER Gábor
On Wed, Dec 05, 2018 at 04:36:26PM -0500, Jeff King wrote:
> The signal interrupts the first wait.

Ah, of course.  I'm ashamed to say that this is not the first time I
forget about that...

> > Bash 4.3 or later are strange: I get back the shell prompt immediately
> > after ctrl-C as well, so it doesn't appear to be waiting for all
> > remaining jobs to finish either, but! I don't get any of the progress
> > output from those jobs to mess up my next command.
> 
> Interesting. My bash 4.4 seems to behave the same as dash. It almost
> sounds like the SIGINT is getting passed to the subshells for you.
> Probably not really worth digging into, though.

The subshell does indeed get SIGINT.  I don't know why that happens,
to my understanding that should not happen.

> In case anybody is playing with this and needed to simulate a stress
> failure, here's what I used:
> 
> diff --git a/t/t-basic.sh b/t/t-basic.sh
> index b6566003dd..a370cd9977 100755
> --- a/t/t-basic.sh
> +++ b/t/t-basic.sh
> @@ -1171,4 +1171,11 @@ test_expect_success 'very long name in the index 
> handled sanely' '
>   test $len = 4098
>  '
>  
> +test_expect_success 'roll those dice' '
> + case "$(openssl rand -base64 1)" in
> + z*)
> + return 1
> + esac
> +'

Wasteful :)

  test $(($$ % 42)) -ne 0



Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-05 Thread SZEDER Gábor
On Wed, Dec 05, 2018 at 03:01:41PM +0100, Ævar Arnfjörð Bjarmason wrote:
> >> decide to stress test in advance, since we'd either flock() the trash
> >> dir, or just mktemp(1)-it.
> >
> > While 'mktemp' seems to be more portable than 'flock', it doesn't seem
> > to be portable enough; at least it's not in POSIX.>
> 
> We are already relying on stuff like mktemp() being reliable for
> e.g. the split index.

That's the mktemp() function from libc; I meant the 'mktemp' command
that we would use this early in 'test-lib.sh', where PATH has not been
set up for testing yet.



Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-05 Thread SZEDER Gábor
On Wed, Dec 05, 2018 at 12:44:09AM -0500, Jeff King wrote:
> On Tue, Dec 04, 2018 at 05:34:57PM +0100, SZEDER Gábor wrote:
> 
> > To prevent the several parallel invocations of the same test from
> > interfering with each other:
> > 
> >   - Include the parallel job's number in the name of the trash
> > directory and the various output files under 't/test-results/' as
> > a '.stress-' suffix.
> 
> Makes sense.
> 
> >   - Add the parallel job's number to the port number specified by the
> > user or to the test number, so even tests involving daemons
> > listening on a TCP socket can be stressed.
> 
> In my script I just use an arbitrary sequence of ports. That means two
> stress runs may stomp on each other, but stress runs tend not to
> interfere with regular runs (whereas with your scheme, a stress run of
> t5561 is going to stomp on t5562). It probably doesn't matter much
> either way, as I tend not to do too much with the machine during a
> stress run.

A custom port can always be set via environment variables, though the
user has to remember to set it (I doubt that I would remember it until
reminded by test failures...)

> >   - Make '--stress' imply '--verbose-log' and discard the test's
> > standard ouput and error; dumping the output of several parallel
> > tests to the terminal would create a big ugly mess.
> 
> Makes sense. My script just redirects the output to a file, but it
> predates --verbose-log.
> 
> My script always runs with "-x". I guess it's not that hard to add it if
> you want, but it is annoying to finally get a failure after several
> minutes only to realize that your are missing some important
> information. ;)
> 
> Ditto for "-i", which leaves more readable output (you know the
> interesting part is at the end of the log), and leaves a trash directory
> state that is more likely to be useful for inspecting.

I wanted to imply only the bare minimum of options that are necessary
to prevent the tests from stomping on each other.  Other than that I
agree and wouldn't run '--stress' without '-x -i' myself, and at one
point considered to recommend '-x -i' in the description of
'--stress'.

> My script also implies "--root". That's not strictly necessary, though I
> suspect it is much more likely to catch races when it's run on a ram
> disk (simply because it leaves the CPU as the main bottleneck).
> 
> > 'wait' for all parallel jobs before exiting (either because a failure
> > was found or because the user lost patience and aborted the stress
> > test), allowing the still running tests to finish.  Otherwise the "OK
> > X.Y" progress output from the last iteration would likely arrive after
> > the user got back the shell prompt, interfering with typing in the
> > next command.  OTOH, this waiting might induce a considerable delay
> > between hitting ctrl-C and the test actually exiting; I'm not sure
> > this is the right tradeoff.
> 
> If there is a delay, it's actually quite annoying to _not_ wait; you
> start doing something in the terminal, and then a bunch of extra test
> statuses print at a random time.

At least the INT/TERM/etc. handler should say something like "Waiting
for background jobs to finish", to let the user know why it doesn't
exit right away.

An alternative would be to exit right away, and make the background
loops a tad bit smarter to not print these progress lines when
aborting.

> > +   job_nr=0
> > +   while test $job_nr -lt "$job_count"
> > +   do
> > +   (
> > +   GIT_TEST_STRESS_STARTED=done
> > +   GIT_TEST_STRESS_JOB_NR=$job_nr
> > +   export GIT_TEST_STRESS_STARTED GIT_TEST_STRESS_JOB_NR
> > +
> > +   cnt=0
> > +   while ! test -e "$stressfail"
> > +   do
> > +   if $TEST_SHELL_PATH "$0" "$@" >/dev/null 2>&1
> > +   then
> > +   printf >&2 "OK   %2d.%d\n" 
> > $GIT_TEST_STRESS_JOB_NR $cnt
> 
> OK, this adds a counter for each job number (compared to my script).
> Seems helpful.
> 
> > +   elif test -f "$stressfail" &&
> > +test "$(cat "$stressfail")" = "aborted"
> > +   then
> > +   printf >&2 "ABORTED %2d.%d\n" 
> > $GIT_TEST_STRESS_JOB_NR $cnt
> > +   else
> > +   

Re: [PATCH 2/3] test-lib-functions: introduce the 'test_set_port' helper function

2018-12-05 Thread SZEDER Gábor
On Wed, Dec 05, 2018 at 12:17:09AM -0500, Jeff King wrote:
> On Tue, Dec 04, 2018 at 05:34:56PM +0100, SZEDER Gábor wrote:
> 
> > Several test scripts run daemons like 'git-daemon' or Apache, and
> > communicate with them through TCP sockets.  To have unique ports where
> > these daemons are accessible, the ports are usually the number of the
> > corresponding test scripts, unless the user overrides them via
> > environment variables, and thus all those tests and test libs contain
> > more or less the same bit of one-liner boilerplate code to find out
> > the port.  The last patch in this series will make this a bit more
> > complicated.
> > 
> > Factor out finding the port for a daemon into the common helper
> > function 'test_set_port' to avoid repeating ourselves.
> 
> OK. Looks like this should keep the same port numbers for all of the
> existing tests, which I think is good.

Not for all existing tests, though: t0410 and the 'git p4' tests do
get different ports.


Hrm, speaking of affecting 'git p4' tests...  I should have put Luke
on Cc, so doing it now.



Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-05 Thread SZEDER Gábor
On Tue, Dec 04, 2018 at 07:11:08PM +0100, Ævar Arnfjörð Bjarmason wrote:
> It's a frequent annoyance of mine in the test suite that I'm
> e.g. running t*.sh with some parallel "prove" in one screen, and then I
> run tABCD*.sh manually, and get unlucky because they use the same trash
> dir, and both tests go boom.
> 
> You can fix that with --root, which is much of what this patch does. My
> one-liner for doing --stress has been something like:
> 
> perl -E 'say ++$_ while 1' | parallel --jobs=100% --halt-on-error 
> soon,fail=1 './t-basic.sh --root=trash-{} -v'
> 
> But it would be great if I didn't have to worry about that and could
> just run two concurrent:
> 
> ./t-basic.sh
> 
> So I think we could just set some env variable where instead of having
> the predictable trash directory we have a $TRASHDIR.$N as this patch
> does, except we pick $N by locking some test-runs/tABCD.Nth file with
> flock() during the run.

Is 'flock' portable?  It doesn't appear so.

> Then a stress mode like this would just be:
> 
> GIT_TEST_FLOCKED_TRASH_DIRS=1 perl -E 'say ++$_ while 1' | parallel 
> --jobs=100% --halt-on-error soon,fail=1 './t-basic.sh'

This doesn't address the issue of TCP ports for various daemons.

> And sure, we could ship a --stress option too, but it wouldn't be
> magical in any way, just another "spawn N in a loop" implementation, but
> you could also e.g. use GNU parallel to drive it, and without needing to

GNU 'parallel' may or may not be available on the platform, that's why
I stuck with the barebones shell-loops-in-the-background approach.

> decide to stress test in advance, since we'd either flock() the trash
> dir, or just mktemp(1)-it.

While 'mktemp' seems to be more portable than 'flock', it doesn't seem
to be portable enough; at least it's not in POSIX.

And in general I'd prefer deterministic trash directory names.  If I
re-run a failed test after fixing the bug, then currently the trash
directory will be overwritten, and that's good.  With 'mktemp's junk
in the trash direcory's name it won't be overwritten, and if my bugfix
doesn't work, then I'll start accumulating trash directories and won't
even know which one is from the last run.



Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-05 Thread SZEDER Gábor


Just a quick reply to this one point for now:

On Wed, Dec 05, 2018 at 12:44:09AM -0500, Jeff King wrote:
> On Tue, Dec 04, 2018 at 05:34:57PM +0100, SZEDER Gábor wrote:
> > +   job_nr=0
> > +   while test $job_nr -lt "$job_count"
> > +   do
> > +   wait
> > +   job_nr=$(($job_nr + 1))
> > +   done
> 
> Do we need to loop? Calling "wait" with no arguments should wait for all
> children.

It should, but in dash, ksh, ksh93, Bash v4.2 or older it doesn't seem
to do so, at least not on my machine, and I get back my shell prompt
right after hitting ctrl-C or the first failed test, and then get the
progress output from the remaining jobs later.

Bash 4.3 or later are strange: I get back the shell prompt immediately
after ctrl-C as well, so it doesn't appear to be waiting for all
remaining jobs to finish either, but! I don't get any of the progress
output from those jobs to mess up my next command.

And mksh and zsh can't run our tests, and I don't have any more shells
at hand to try.



Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-04 Thread SZEDER Gábor
On Tue, Dec 04, 2018 at 06:04:14PM +0100, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Dec 04 2018, SZEDER Gábor wrote:
> 
> > The number of parallel invocations is determined by, in order of
> > precedence: the number specified as '--stress=', or the value of
> > the GIT_TEST_STRESS_LOAD environment variable, or twice the number of
> > available processors in '/proc/cpuinfo', or 8.
> 
> With this series we have at least 3 ways to get this number. First
> online_cpus() in the C code, then Peff's recent `getconf
> _NPROCESSORS_ONLN` in doc-diff, and now this /proc/cpuinfo parsing.
> 
> Perhaps it makes sense to split online_cpus() into a helper to use from
> the shellscripts instead?

I don't think so, but I will update this patch to use 'getconf'
instead.



[PATCH 1/3] test-lib: consolidate naming of test-results paths

2018-12-04 Thread SZEDER Gábor
There are two places where we strip off any leading path components
and the '.sh' suffix from the test script's pathname, and there are
two places where we construct the filename of test output files in
't/test-results/'.  The last patch in this series will add even more.

Factor these out into helper variables to avoid repeating ourselves.

Signed-off-by: SZEDER Gábor 
---
 t/test-lib.sh | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0f1faa24b2..49e4563405 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -71,6 +71,9 @@ then
exit 1
 fi
 
+TEST_NAME="$(basename "$0" .sh)"
+TEST_RESULTS_BASE="$TEST_OUTPUT_DIRECTORY/test-results/$TEST_NAME"
+
 # if --tee was passed, write the output not only to the terminal, but
 # additionally to the file test-results/$BASENAME.out, too.
 case "$GIT_TEST_TEE_STARTED, $* " in
@@ -78,12 +81,11 @@ done,*)
# do not redirect again
;;
 *' --tee '*|*' --va'*|*' -V '*|*' --verbose-log '*)
-   mkdir -p "$TEST_OUTPUT_DIRECTORY/test-results"
-   BASE="$TEST_OUTPUT_DIRECTORY/test-results/$(basename "$0" .sh)"
+   mkdir -p "$(dirname "$TEST_RESULTS_BASE")"
 
# Make this filename available to the sub-process in case it is using
# --verbose-log.
-   GIT_TEST_TEE_OUTPUT_FILE=$BASE.out
+   GIT_TEST_TEE_OUTPUT_FILE=$TEST_RESULTS_BASE.out
export GIT_TEST_TEE_OUTPUT_FILE
 
# Truncate before calling "tee -a" to get rid of the results
@@ -91,8 +93,8 @@ done,*)
>"$GIT_TEST_TEE_OUTPUT_FILE"
 
(GIT_TEST_TEE_STARTED=done ${TEST_SHELL_PATH} "$0" "$@" 2>&1;
-echo $? >"$BASE.exit") | tee -a "$GIT_TEST_TEE_OUTPUT_FILE"
-   test "$(cat "$BASE.exit")" = 0
+echo $? >"$TEST_RESULTS_BASE.exit") | tee -a 
"$GIT_TEST_TEE_OUTPUT_FILE"
+   test "$(cat "$TEST_RESULTS_BASE.exit")" = 0
exit
;;
 esac
@@ -818,12 +820,9 @@ test_done () {
 
if test -z "$HARNESS_ACTIVE"
then
-   test_results_dir="$TEST_OUTPUT_DIRECTORY/test-results"
-   mkdir -p "$test_results_dir"
-   base=${0##*/}
-   test_results_path="$test_results_dir/${base%.sh}.counts"
+   mkdir -p "$(dirname "$TEST_RESULTS_BASE")"
 
-   cat >"$test_results_path" <<-EOF
+   cat >"$TEST_RESULTS_BASE.counts" <<-EOF
total $test_count
success $test_success
fixed $test_fixed
@@ -1029,7 +1028,7 @@ then
 fi
 
 # Test repository
-TRASH_DIRECTORY="trash directory.$(basename "$0" .sh)"
+TRASH_DIRECTORY="trash directory.$TEST_NAME"
 test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY"
 case "$TRASH_DIRECTORY" in
 /*) ;; # absolute path is good
-- 
2.20.0.rc2.156.g5a9fd2ce9c



[RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-04 Thread SZEDER Gábor
Unfortunately, we have a few flaky tests, whose failures tend to be
hard to reproduce.  We've found that the best we can do to reproduce
such a failure is to run the test repeatedly while the machine is
under load, and wait in the hope that the load creates enough variance
in the timing of the test's commands that a failure is evenually
triggered.  I have a command to do that, and I noticed that two other
contributors have rolled their own scripts to do the same, all
choosing slightly different approaches.

To help reproduce failures in flaky tests, introduce the '--stress'
option to run a test script repeatedly in multiple parallel
invocations until one of them fails, thereby using the test script
itself to increase the load on the machine.

The number of parallel invocations is determined by, in order of
precedence: the number specified as '--stress=', or the value of
the GIT_TEST_STRESS_LOAD environment variable, or twice the number of
available processors in '/proc/cpuinfo', or 8.

To prevent the several parallel invocations of the same test from
interfering with each other:

  - Include the parallel job's number in the name of the trash
directory and the various output files under 't/test-results/' as
a '.stress-' suffix.

  - Add the parallel job's number to the port number specified by the
user or to the test number, so even tests involving daemons
listening on a TCP socket can be stressed.

  - Make '--stress' imply '--verbose-log' and discard the test's
standard ouput and error; dumping the output of several parallel
tests to the terminal would create a big ugly mess.

'wait' for all parallel jobs before exiting (either because a failure
was found or because the user lost patience and aborted the stress
test), allowing the still running tests to finish.  Otherwise the "OK
X.Y" progress output from the last iteration would likely arrive after
the user got back the shell prompt, interfering with typing in the
next command.  OTOH, this waiting might induce a considerable delay
between hitting ctrl-C and the test actually exiting; I'm not sure
this is the right tradeoff.

Based on Jeff King's 'stress' script.

Signed-off-by: SZEDER Gábor 
---
 t/README| 13 ++-
 t/test-lib-functions.sh |  7 +++-
 t/test-lib.sh   | 82 +++--
 3 files changed, 96 insertions(+), 6 deletions(-)

diff --git a/t/README b/t/README
index 28711cc508..9851de25c2 100644
--- a/t/README
+++ b/t/README
@@ -186,6 +186,16 @@ appropriately before running "make".
this feature by setting the GIT_TEST_CHAIN_LINT environment
variable to "1" or "0", respectively.
 
+--stress::
+--stress=::
+   Run the test script repeatedly in multiple parallel
+   invocations until one of them fails.  Useful for reproducing
+   rare failures in flaky tests.  The number of parallel
+   invocations is, in order of precedence: , or the value of
+   the GIT_TEST_STRESS_LOAD environment variable, or twice the
+   number of available processors in '/proc/cpuinfo', or 8.
+   Implies `--verbose-log`.
+
 You can also set the GIT_TEST_INSTALLED environment variable to
 the bindir of an existing git installation to test that installation.
 You still need to have built this git sandbox, from which various
@@ -425,7 +435,8 @@ This test harness library does the following things:
  - Creates an empty test directory with an empty .git/objects database
and chdir(2) into it.  This directory is 't/trash
directory.$test_name_without_dotsh', with t/ subject to change by
-   the --root option documented above.
+   the --root option documented above, and a '.stress-' suffix
+   appended by the --stress option.
 
  - Defines standard test helper functions for your scripts to
use.  These functions are designed to make all scripts behave
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index d9a602cd0f..9af11e3eed 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1288,8 +1288,6 @@ test_set_port () {
# root-only port, use a larger one instead.
port=$(($port + 1))
fi
-
-   eval $var=$port
;;
*[^0-9]*)
error >&7 "invalid port number: $port"
@@ -1298,4 +1296,9 @@ test_set_port () {
# The user has specified the port.
;;
esac
+
+   # Make sure that parallel '--stress' test jobs get different
+   # ports.
+   port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
+   eval $var=$port
 }
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 49e4563405..9b7f687396 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -71,8 +71,81 @@ then
exit 1
 fi
 
+TEST_STRESS_SFX="${GIT_TEST_STRESS_JOB_NR:+.stress-$GIT_TEST_STRESS_JOB_NR}"
 TEST_NAME="$(basename "$0" .sh)"

[RFC PATCH 0/3] test-lib: add the '--stress' option to help reproduce occasional failures in flaky tests

2018-12-04 Thread SZEDER Gábor
Inspired by Peff's 'stress' script mentioned in:

  https://public-inbox.org/git/20181122161722.gc28...@sigill.intra.peff.net/

the last patch in this series brings that functionality to our test
library to help to reproduce failures in flaky tests.  So

  ./t1234-foo --stress
  
will run that test script repeatedly in multiple parallel invocations,
in the hope that the increased load creates enough variance in the
timing of the test's commands that a failure is evenually triggered.


SZEDER Gábor (3):
  test-lib: consolidate naming of test-results paths
  test-lib-functions: introduce the 'test_set_port' helper function
  test-lib: add the '--stress' option to run a test repeatedly under
load

 t/README | 13 +-
 t/lib-git-daemon.sh  |  2 +-
 t/lib-git-p4.sh  |  9 +---
 t/lib-git-svn.sh |  2 +-
 t/lib-httpd.sh   |  2 +-
 t/t0410-partial-clone.sh |  1 -
 t/t5512-ls-remote.sh |  2 +-
 t/test-lib-functions.sh  | 39 
 t/test-lib.sh| 99 +++-
 9 files changed, 143 insertions(+), 26 deletions(-)

-- 
2.20.0.rc2.156.g5a9fd2ce9c



[PATCH 2/3] test-lib-functions: introduce the 'test_set_port' helper function

2018-12-04 Thread SZEDER Gábor
Several test scripts run daemons like 'git-daemon' or Apache, and
communicate with them through TCP sockets.  To have unique ports where
these daemons are accessible, the ports are usually the number of the
corresponding test scripts, unless the user overrides them via
environment variables, and thus all those tests and test libs contain
more or less the same bit of one-liner boilerplate code to find out
the port.  The last patch in this series will make this a bit more
complicated.

Factor out finding the port for a daemon into the common helper
function 'test_set_port' to avoid repeating ourselves.

Take special care of test scripts with "low" numbers:

  - Test numbers below 1024 would result in a port that's only usable
as root, so set their port to '1 + test-nr' to make sure it
doesn't interfere with other tests in the test suite.  This makes
the hardcoded port number in 't0410-partial-clone.sh' unnecessary,
remove it.

  - The shell's arithmetic evaluation interprets numbers with leading
zeros as octal values, which means that test number below 1000 and
containing the digits 8 or 9 will trigger an error.  Remove all
leading zeros from the test numbers to prevent this.

Note that the Perforce tests are unlike the other tests involving
daemons in that:

  - 'lib-git-p4.sh' doesn't use the test's number for unique port as
is, but does a bit of additional arithmetic on top [1].

  - The port is not overridable via an environment variable.

With this patch even Perforce tests will use the test's number as
default port, and it will be overridable via the P4DPORT environment
variable.

[1] Commit fc00233071 (git-p4 tests: refactor and cleanup, 2011-08-22)
introduced that "unusual" unique port computation without
explaining why it was necessary (as opposed to simply using the
test number as is).  It seems to be just unnecessary complication,
and in any case that commit came way before the "test nr as unique
port" got "standardized" for other daemons in commits c44132fcf3
(tests: auto-set git-daemon port, 2014-02-10), 3bb486e439 (tests:
auto-set LIB_HTTPD_PORT from test name, 2014-02-10), and
bf9d7df950 (t/lib-git-svn.sh: improve svnserve tests with parallel
make test, 2017-12-01).

Signed-off-by: SZEDER Gábor 
---
 t/lib-git-daemon.sh  |  2 +-
 t/lib-git-p4.sh  |  9 +
 t/lib-git-svn.sh |  2 +-
 t/lib-httpd.sh   |  2 +-
 t/t0410-partial-clone.sh |  1 -
 t/t5512-ls-remote.sh |  2 +-
 t/test-lib-functions.sh  | 36 
 7 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/t/lib-git-daemon.sh b/t/lib-git-daemon.sh
index f98de95c15..41eb1e3ae8 100644
--- a/t/lib-git-daemon.sh
+++ b/t/lib-git-daemon.sh
@@ -28,7 +28,7 @@ then
test_skip_or_die $GIT_TEST_GIT_DAEMON "file system does not support 
FIFOs"
 fi
 
-LIB_GIT_DAEMON_PORT=${LIB_GIT_DAEMON_PORT-${this_test#t}}
+test_set_port LIB_GIT_DAEMON_PORT
 
 GIT_DAEMON_PID=
 GIT_DAEMON_DOCUMENT_ROOT_PATH="$PWD"/repo
diff --git a/t/lib-git-p4.sh b/t/lib-git-p4.sh
index c27599474c..b3be3ba011 100644
--- a/t/lib-git-p4.sh
+++ b/t/lib-git-p4.sh
@@ -53,14 +53,7 @@ time_in_seconds () {
(cd / && "$PYTHON_PATH" -c 'import time; print(int(time.time()))')
 }
 
-# Try to pick a unique port: guess a large number, then hope
-# no more than one of each test is running.
-#
-# This does not handle the case where somebody else is running the
-# same tests and has chosen the same ports.
-testid=${this_test#t}
-git_p4_test_start=9800
-P4DPORT=$((10669 + ($testid - $git_p4_test_start)))
+test_set_port P4DPORT
 
 P4PORT=localhost:$P4DPORT
 P4CLIENT=client
diff --git a/t/lib-git-svn.sh b/t/lib-git-svn.sh
index a8130f9119..f3b478c307 100644
--- a/t/lib-git-svn.sh
+++ b/t/lib-git-svn.sh
@@ -13,6 +13,7 @@ fi
 GIT_DIR=$PWD/.git
 GIT_SVN_DIR=$GIT_DIR/svn/refs/remotes/git-svn
 SVN_TREE=$GIT_SVN_DIR/svn-tree
+test_set_port SVNSERVE_PORT
 
 svn >/dev/null 2>&1
 if test $? -ne 1
@@ -119,7 +120,6 @@ require_svnserve () {
 }
 
 start_svnserve () {
-   SVNSERVE_PORT=${SVNSERVE_PORT-${this_test#t}}
svnserve --listen-port $SVNSERVE_PORT \
 --root "$rawsvnrepo" \
 --listen-once \
diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index a8729f8232..e465116ef9 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -82,7 +82,7 @@ case $(uname) in
 esac
 
 LIB_HTTPD_PATH=${LIB_HTTPD_PATH-"$DEFAULT_HTTPD_PATH"}
-LIB_HTTPD_PORT=${LIB_HTTPD_PORT-${this_test#t}}
+test_set_port LIB_HTTPD_PORT
 
 TEST_PATH="$TEST_DIRECTORY"/lib-httpd
 HTTPD_ROOT_PATH="$PWD"/httpd
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index ba3887f178..0aca8d7588 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -480,7 +480,6 @@ test_expect_success 'gc stops traversal w

Re: [PATCH 5/5] test-lib: add support for GIT_TODO_TESTS

2018-12-04 Thread SZEDER Gábor
On Tue, Nov 27, 2018 at 11:54:45PM +0100, Ævar Arnfjörð Bjarmason wrote:
> As noted in the updated t/README documentation being added here
> setting this new GIT_TODO_TESTS variable is usually better than
> GIT_SKIP_TESTS.

I don't see why this is "usually better".  I get how it can help your
particular use-case described below, but that doesn't mean that it's
"usually better".

> My use-case for this is to get feedback from the CI infrastructure[1]
> about which tests are passing due to fixes that have trickled into
> git.git.
> 
> With the GIT_SKIP_TESTS variable this use-case is painful, you need to
> do an occasional manual run without GIT_SKIP_TESTS set. It's much
> better to use GIT_TODO_TESTS and get a report of passing TODO tests
> from prove(1) at the bottom of the test output. Once those passing
> TODO tests have trickled down to 'master' the relevant glob (set for
> all of master/next/pu) can be removed.

Neither from the commit message nor from the documentation is it clear
to me what the result of 'make test' will be when a test listed in
GIT_TODO_TESTS fails.

> As seen in the "GIT_TODO_TESTS mixed failure" test the lack of
> interaction with existing tests marked as TODO by the test suite
> itself is intentional. There's no need to print out something saying
> they matched GIT_TODO_TESTS if they're already TODO tests.
> 
> 1. https://public-inbox.org/git/875zwm15k2@evledraar.gmail.com/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  ci/lib-travisci.sh |  2 +-
>  t/README   | 18 +--
>  t/t-basic.sh   | 81 +++---
>  t/test-lib.sh  | 31 +++---
>  4 files changed, 112 insertions(+), 20 deletions(-)
> 
> diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
> index 69dff4d1ec..ad8290bfdb 100755
> --- a/ci/lib-travisci.sh
> +++ b/ci/lib-travisci.sh
> @@ -121,7 +121,7 @@ osx-clang|osx-gcc)
>   # t9810 occasionally fails on Travis CI OS X
>   # t9816 occasionally fails with "TAP out of sequence errors" on
>   # Travis CI OS X
> - export GIT_SKIP_TESTS="t9810 t9816"
> + export GIT_TODO_TESTS="t9810 t9816"

This change is not mentioned in the commit message.

As noted in the hunk's context, these test scripts are not skipped
because they don't work on OSX at all, but because they are flaky.
Consequently, reporting them as "maybe should be un-TODO'd" when they
happen to succeed is pointless and will just lead to confusion, so
this seems to be a case when GIT_TODO_TESTS is actually worse than
GIT_SKIP_TESTS.

If a failing test in GIT_TODO_TESTS makes the whole 'make test' fail,
then this should be most definitely left as GIT_SKIP_TESTS.

>   ;;
>  GIT_TEST_GETTEXT_POISON)
>   export GIT_TEST_GETTEXT_POISON=YesPlease
> diff --git a/t/README b/t/README
> index c03b268813..922b4fb3bf 100644
> --- a/t/README
> +++ b/t/README
> @@ -207,8 +207,19 @@ ideally be reported as bugs and fixed, or guarded by a 
> prerequisite
>  (see "Using test prerequisites" below). But until then they can be
>  skipped.
>  
> -To skip tests, set the GIT_SKIP_TESTS variable. Individual tests can
> -be skipped:
> +To skip tests, set either the GIT_SKIP_TESTS or GIT_TODO_TESTS
> +variables. The difference is that with SKIP the tests won't be run at
> +all, whereas they will be run with TODO, but in success or failure
> +annotated as a TODO test.

This is confusing.  "To skip" a test means that the test is not run at
all.  Now, if GIT_TODO_TESTS were to run the listed tests, then it
definitely won't skip them, so this sentence contradicts the previous
one.

What does "annotated as a TODO test" mean?  Something similar to how
'test_expect_failure' works?

> +It's usually preferrable to use TODO, since the test suite will report
> +those tests that unexpectedly succeed, which may indicate that
> +whatever bug broke the test in the past has been fixed, and the test
> +should be un-TODO'd. There's no such feedback loop with
> +GIT_SKIP_TESTS.
> +
> +The GIT_SKIP_TESTS and GIT_TODO_TESTS take the same values. Individual
> +tests can be skipped:
>  
>  $ GIT_SKIP_TESTS=t9200.8 sh ./t9200-git-cvsexport-commit.sh
>  
> @@ -223,7 +234,8 @@ patterns that tells which tests to skip, and either can 
> match the
>  
>  For an individual test suite --run could be used to specify that
>  only some tests should be run or that some tests should be
> -excluded from a run.
> +excluded from a run. The --run option is a shorthand for setting
> +a GIT_SKIP_TESTS pattern.
>  
>  The argument for --run is a list of individual test numbers or
>  ranges with an optional negation prefix that define what tests in


Re: "git add -p" versus "git add -i", followed by "p"

2018-12-02 Thread SZEDER Gábor
On Sun, Dec 02, 2018 at 11:30:19AM -0500, Robert P. J. Day wrote:
> 
>   testing adding by patch for the very first time (i've just never
> needed this), and reading the "progit" book and reading the man page,
> and the impression i'm getting is that running "git add -p" (going
> straight to patch mode) is supposed to be equivalent to running "git
> add -i", then typing "p" to switch to patch mode.
> 
>   that is most emphatically not what i'm seeing. if i run "git add
> -p", then i get to what i expect -- the patch subsystem:
> 
>   $ git add -p
>   diff --git a/README.asc b/README.asc
>   index fa40bad..840e85b 100644
>   --- a/README.asc
>   +++ b/README.asc
>   @@ -1,3 +1,9 @@
>   +change 1
>   +
>   +
>   +
>   +
>   +
>= Pro Git, Second Edition
> 
>Welcome to the second edition of the Pro Git book.
>   Stage this hunk [y,n,q,a,d,j,J,g,/,e,?]?
> 
> but if i start with "git add -i", there seems to be no way to get to
> patch mode -- certainly "p" doesn't do it. am i stupidly missing
> something trivial? is the explanation misleading or inncomplete?

Worksforme™:

  $ echo "New content" >>README.md 
  $ echo "New content" >>t/README
  $ echo "New content" >>contrib//README
  $ git add -i
 staged unstaged path
1:unchanged+1/-0 README.md
2:unchanged+1/-0 contrib/README
3:unchanged+1/-0 t/README
  
  *** Commands ***
1: status   2: update   3: revert   4: add untracked
5: patch6: diff 7: quit 8: help
  What now> p
 staged unstaged path
1:unchanged+1/-0 README.md
2:unchanged+1/-0 contrib/README
3:unchanged+1/-0 t/README
  Patch update>> 1
 staged unstaged path
  * 1:unchanged+1/-0 README.md
2:unchanged+1/-0 contrib/README
3:unchanged+1/-0 t/README
  Patch update>> 2
 staged unstaged path
  * 1:unchanged+1/-0 README.md
  * 2:unchanged+1/-0 contrib/README
3:unchanged+1/-0 t/README
  Patch update>> 

Here I hit enter.  Did you?

  diff --git a/README.md b/README.md
  index f920a42fad..63dee5cfc3 100644
  --- a/README.md
  +++ b/README.md
  @@ -62,3 +62,4 @@ and the name as (depending on your mood):
   [Documentation/giteveryday.txt]: Documentation/giteveryday.txt
   [Documentation/gitcvs-migration.txt]:
  Documentation/gitcvs-migration.txt
   [Documentation/SubmittingPatches]: Documentation/SubmittingPatches
  +New content
  Stage this hunk [y,n,q,a,d,e,?]? y
  
  diff --git a/contrib/README b/contrib/README
  index 05f291c1f1..2b152dfcff 100644
  --- a/contrib/README
  +++ b/contrib/README
  @@ -41,3 +41,4 @@ submit a patch to create a subdirectory of contrib/
  and put your
   stuff there.
   
   -jc
  +New content
  Stage this hunk [y,n,q,a,d,e,?]? n
  
  *** Commands ***
1: status   2: update   3: revert   4: add untracked
5: patch6: diff 7: quit 8: help
  What now> q
  Bye.
  $ git diff --cached 
  diff --git a/README.md b/README.md
  index f920a42fad..63dee5cfc3 100644
  --- a/README.md
  +++ b/README.md
  @@ -62,3 +62,4 @@ and the name as (depending on your mood):
   [Documentation/giteveryday.txt]: Documentation/giteveryday.txt
   [Documentation/gitcvs-migration.txt]: Documentation/gitcvs-migration.txt
   [Documentation/SubmittingPatches]: Documentation/SubmittingPatches
  +New content
  $


Arguably the documentation could make it clear that the user can
choose multiple files at once, e.g.:

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index c9623854bf..061f9cbb0d 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -317,9 +317,9 @@ add untracked::
 
 patch::
 
-  This lets you choose one path out of a 'status' like selection.
-  After choosing the path, it presents the diff between the index
-  and the working tree file and asks you if you want to stage
+  This lets you choose one or more paths out of a 'status' like selection.
+  After choosing the path(s), it presents the diff between the index
+  and the working tree file(s) and asks you if you want to stage
   the change of each hunk.  You can select one of the following
   options and type return:
 
And perhaps we could have a dedicated menu entry for "I'm done with
selecting paths"?  Dunno; I'm a 'git add -p' user myself.



Re: [PATCH v1] teach git to support a virtual (partially populated) work directory

2018-11-28 Thread SZEDER Gábor
On Tue, Nov 27, 2018 at 02:50:57PM -0500, Ben Peart wrote:

> diff --git a/t/t1092-virtualworkdir.sh b/t/t1092-virtualworkdir.sh
> new file mode 100755
> index 00..0cdfe9b362
> --- /dev/null
> +++ b/t/t1092-virtualworkdir.sh
> @@ -0,0 +1,393 @@
> +#!/bin/sh
> +
> +test_description='virtual work directory tests'
> +
> +. ./test-lib.sh
> +
> +# We need total control of the virtual work directory hook
> +sane_unset GIT_TEST_VIRTUALWORKDIR
> +
> +clean_repo () {
> + rm .git/index &&
> + git -c core.virtualworkdir=false reset --hard HEAD &&
> + git -c core.virtualworkdir=false clean -fd &&
> + touch untracked.txt &&

We would usually run '>untracked.txt' instead, sparing the external
process.

A further nit is that a function called 'clean_repo' creates new
untracked files...

> + touch dir1/untracked.txt &&
> + touch dir2/untracked.txt
> +}
> +
> +test_expect_success 'setup' '
> + mkdir -p .git/hooks/ &&
> + cat > .gitignore <<-\EOF &&

CodingGuidelines suggest no space between redirection operator and
filename.

> + .gitignore
> + expect*
> + actual*
> + EOF
> + touch file1.txt &&
> + touch file2.txt &&
> + mkdir -p dir1 &&
> + touch dir1/file1.txt &&
> + touch dir1/file2.txt &&
> + mkdir -p dir2 &&
> + touch dir2/file1.txt &&
> + touch dir2/file2.txt &&
> + git add . &&
> + git commit -m "initial" &&
> + git config --local core.virtualworkdir true
> +'


> +test_expect_success 'verify files not listed are ignored by git clean -f -x' 
> '
> + clean_repo &&

I find it odd to clean the repo right after setting it up; but then
again, 'clean_repo' not only cleans, but also creates new files.
Perhaps rename it to 'reset_repo'?  Dunno.

> + write_script .git/hooks/virtual-work-dir <<-\EOF &&
> + printf "untracked.txt\0"
> + printf "dir1/\0"
> + EOF
> + mkdir -p dir3 &&
> + touch dir3/untracked.txt &&
> + git clean -f -x &&
> + test -f file1.txt &&

Please use the 'test_path_is_file', ...

> + test -f file2.txt &&
> + test ! -f untracked.txt &&

... 'test_path_is_missing', and ...

> + test -d dir1 &&

... 'test_path_is_dir' helpers, respectively, because they print
informative error messages on failure.

> + test -f dir1/file1.txt &&
> + test -f dir1/file2.txt &&
> + test ! -f dir1/untracked.txt &&
> + test -f dir2/file1.txt &&
> + test -f dir2/file2.txt &&
> + test -f dir2/untracked.txt &&
> + test -d dir3 &&
> + test -f dir3/untracked.txt
> +'


Re: [PATCH] t5562: skip if NO_CURL is enabled

2018-11-28 Thread SZEDER Gábor
On Tue, Nov 20, 2018 at 04:11:08AM -0500, Jeff King wrote:
> On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote:
> 
> > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in
> > t5562/invoke-with-content-length.pl, while I seem to be getting some
> > sporadic errors in 9 with the following output :
> > 
> > ++ env CONTENT_TYPE=application/x-git-receive-pack-request
> > QUERY_STRING=/repo.git/git-receive-pack
> > 'PATH_TRANSLATED=/home/carenas/src/git/t/trash
> > directory.t5562-http-backend-content-length/.git/git-receive-pack'
> > GIT_HTTP_EXPORT_ALL=TRUE REQUEST_METHOD=POST
> > /home/carenas/src/git/t/t5562/invoke-with-content-length.pl push_body
> > git http-backend
> > ++ verify_http_result '200 OK'
> > ++ grep fatal: act.err
> > Binary file act.err matches
> > ++ return 1
> > error: last command exited with $?=1
> > not ok 9 - push plain
> > 
> > and the following output in act.err (with a 200 in act)
> > 
> > fatal: the remote end hung up unexpectedly
> 
> This bit me today, too, and I can reproduce it by running under my
> stress-testing script.

I saw this a few times on Travis CI as well.

> Curiously, the act.err file also has 54 NUL bytes before the "fatal:"
> message.

I think those NUL bytes come from the file system.

The contents of 'act.err' from the previous test ('fetch gzipped
empty') is usually:

  fatal: request ended in the middle of the gzip stream
  fatal: the remote end hung up unexpectedly

Notice that the length of the first line is 54 bytes (including the
trailing newline).  So I suspect that the following is happening:

  - http-backend in the previous test writes the first line,
  - that test finishes and this one starts,
  - this test truncates 'act.err',
  - and then the still-running http-backend from the previous test
finally writes the second line.

So at this point 'act.err' is empty, but the offset of the fd of the
redirection still open from the previous test is at 54, so the file
system fills those bytes with NULs.


> I tried adding an "strace" to see who was producing that
> output, but I can't seem to get it to fail when running under strace
> (presumably because the timing is quite different, and this likely some
> kind of pipe race).
> 
> -Peff


[PATCH] t/lib-git-daemon: fix signal checking

2018-11-26 Thread SZEDER Gábor
Test scripts checking 'git daemon' stop the daemon with a TERM signal,
and the 'stop_git_daemon' helper checks the daemon's exit status to
make sure that it indeed died because of that signal.

This check is bogus since 03c39b3458 (t/lib-git-daemon: use
test_match_signal, 2016-06-24), for two reasons:

  - Right after killing 'git daemon', 'stop_git_daemon' saves its exit
status in a variable, but since 03c39b3458 the condition checking
the exit status looks at '$?', which at this point is not the exit
status of 'git daemon', but that of the variable assignment, i.e.
it's always 0.

  - The unexpected exit status should abort the whole test script with
'error', but it doesn't, because 03c39b3458 forgot to negate
'test_match_signal's exit status in the condition.

This patch fixes both issues.

Signed-off-by: SZEDER Gábor 
---
 t/lib-git-daemon.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/lib-git-daemon.sh b/t/lib-git-daemon.sh
index edbea2d986..f98de95c15 100644
--- a/t/lib-git-daemon.sh
+++ b/t/lib-git-daemon.sh
@@ -92,7 +92,7 @@ stop_git_daemon() {
kill "$GIT_DAEMON_PID"
wait "$GIT_DAEMON_PID" >&3 2>&4
ret=$?
-   if test_match_signal 15 $?
+   if ! test_match_signal 15 $ret
then
error "git daemon exited with status: $ret"
fi
-- 
2.20.0.rc1.149.g55c2c038c2



Re: t5570 shaky for anyone ?

2018-11-25 Thread SZEDER Gábor
On Sun, Nov 25, 2018 at 09:52:23PM +0100, Torsten Bögershausen wrote:
> After running the  "Git 2.20-rc1" testsuite here on a raspi,
> the only TC that failed was t5570.
> When the "grep" was run on daemon.log, the file was empty (?).
> When inspecting it later, it was filled, and grep would have found
> the "extended.attribute" it was looking for.

I think I saw the same failure on Travis CI already before 2.19, so
it's not a new issue.  Here's the test's verbose output:

  + cat
  + 
  + GIT_OVERRIDE_VIRTUAL_HOST=localhost git -c protocol.version=1 ls-remote 
git://127.0.0.1:5570/interp.git
  b6752e52dd867264d12240028003f21e3e1dccabHEAD
  b6752e52dd867264d12240028003f21e3e1dccabrefs/heads/master
  + cut -d  -f2-
  + grep -i extended.attribute daemon.log
  + test_cmp expect actual
  + diff -u expect actual
  --- expect  2018-06-12 10:06:50.758357927 +
  +++ actual  2018-06-12 10:06:50.774365936 +
  @@ -1,2 +0,0 @@
  -Extended attribute "host": localhost
  -Extended attribute "protocol": version=1
  [10579] Connection from 127.0.0.1:45836
  [10579] Extended attribute "host": localhost
  [10579] Extended attribute "protocol": version=1
  error: last command exited with $?=1
  [10579] Request upload-pack for '/interp.git'
  [10579] Interpolated dir '/usr/src/git/t/trash
  dir.t5570/repo/localhost/interp.git'
  [10462] [10579] Disconnected
  not ok 21 - daemon log records all attributes

The thing is that 'git daemon's log is not written to 'daemon.log'
directly, but it goes through a fifo, which is read by a shell loop,
which then sends all log messages both to 'daemon.log' and to the test
script's standard error.  So there is certainly a race between log
messages going through the fifo and the loop before reaching
'daemon.log' and 'git ls-remote' exiting and 'grep' opening
'daemon.log'.

> The following fixes it, but I am not sure if this is the ideal
> solution.

Currently this is the only test that looks at 'daemon.log', but if we
ever going to add another one, then that will be prone to the same
issue.

I wonder whether it's really that useful to have the daemon log in the
test script's output...  if the log was sent directly to daemon log,
then the window for this race would be smaller, but still not
completely closed.

Anyway, I added Peff to Cc, since he added that whole fifo-shell-loop
thing.

> diff --git a/t/t5570-git-daemon.sh b/t/t5570-git-daemon.sh
> index 7466aad111..e259fee0ed 100755
> --- a/t/t5570-git-daemon.sh
> +++ b/t/t5570-git-daemon.sh
> @@ -192,6 +192,7 @@ test_expect_success 'daemon log records all attributes' '
>   GIT_OVERRIDE_VIRTUAL_HOST=localhost \
>   git -c protocol.version=1 \
>   ls-remote "$GIT_DAEMON_URL/interp.git" &&
> + sleep 1 &&
>   grep -i extended.attribute daemon.log | cut -d" " -f2- >actual &&
>   test_cmp expect actual
>  '
> 
> A slightly better approach may be to use a "sleep on demand":
> 
> + ( grep -i -q extended.attribute daemon.log || sleep 1 ) &&
> 


[PATCH 2/2] commit-graph: don't call write_graph_chunk_large_edges() unnecessarily

2018-11-20 Thread SZEDER Gábor
The optional 'Large Edge List' chunk of the commit graph file stores
parent information for commits with more than two parents.  Since the
chunk is optional, write_commit_graph() looks through all commits to
find those with more than two parents, and then writes the commit
graph file header accordingly, i.e. if there are no such commits, then
there won't be a 'Large Edge List' chunk written, only the three
mandatory chunks.

However, when it comes to writing chunk data, write_commit_graph()
unconditionally invokes write_graph_chunk_large_edges(), even when it
was decided earlier that that chunk won't be written.  Strictly
speaking there is no bug here, because write_graph_chunk_large_edges()
won't write anything because it won't find any commits with more than
two parents, but then it unnecessarily and in vain looks through all
commits once again in search for such commits.

Don't call write_graph_chunk_large_edges() when that chunk won't be
written to spare an unnecessary iteration over all commits.

Signed-off-by: SZEDER Gábor 
---
 commit-graph.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 7b4e3a02cf..965eb23a7b 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -940,7 +940,8 @@ void write_commit_graph(const char *obj_dir,
write_graph_chunk_fanout(f, commits.list, commits.nr);
write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr);
write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr);
-   write_graph_chunk_large_edges(f, commits.list, commits.nr);
+   if (num_large_edges)
+   write_graph_chunk_large_edges(f, commits.list, commits.nr);
 
close_commit_graph(the_repository);
finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_FSYNC);
-- 
2.20.0.rc0.134.gf0022f8e60



[PATCH 1/2] commit-graph: rename 'num_extra_edges' variable to 'num_large_edges'

2018-11-20 Thread SZEDER Gábor
The commit graph file format describes an optional 'Large Edge List'
chunk, and the function writing out this chunk is called
write_graph_chunk_large_edges().  Then there are two functions in
'commit-graph.c', namely write_graph_chunk_data() and
write_commit_graph(), which have a local variable called
'num_extra_edges'.

It can be confusing on first sight whether large edges and extra edges
refer to the same thing or not, but they do, so let's rename those
variables to 'num_large_edges'.

Signed-off-by: SZEDER Gábor 
---

I rename these variables to 'num_large_edges', because the commit
graph file format speaks about the 'Large Edge List' chunk.

However, I do find that the term 'extra' makes much more sense and
fits the concept better (i.e. extra commit graph edges resulting from
the extra parents or octopus merges; after a s/extra/large/g the
previous phrase would make no sense), and notice that the term 'large'
doesn't come up in the file format itseld (the chunk's magic is {'E',
'D', 'G', 'E'}, there is no 'L' in there), but only in the
specification text and a couple of variable and function names in the
code.

Would it make sense to do the rename in the other direction?

 commit-graph.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 40c855f185..7b4e3a02cf 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -475,7 +475,7 @@ static void write_graph_chunk_data(struct hashfile *f, int 
hash_len,
 {
struct commit **list = commits;
struct commit **last = commits + nr_commits;
-   uint32_t num_extra_edges = 0;
+   uint32_t num_large_edges = 0;
 
while (list < last) {
struct commit_list *parent;
@@ -507,7 +507,7 @@ static void write_graph_chunk_data(struct hashfile *f, int 
hash_len,
if (!parent)
edge_value = GRAPH_PARENT_NONE;
else if (parent->next)
-   edge_value = GRAPH_OCTOPUS_EDGES_NEEDED | 
num_extra_edges;
+   edge_value = GRAPH_OCTOPUS_EDGES_NEEDED | 
num_large_edges;
else {
edge_value = sha1_pos(parent->item->object.oid.hash,
  commits,
@@ -521,7 +521,7 @@ static void write_graph_chunk_data(struct hashfile *f, int 
hash_len,
 
if (edge_value & GRAPH_OCTOPUS_EDGES_NEEDED) {
do {
-   num_extra_edges++;
+   num_large_edges++;
parent = parent->next;
} while (parent);
}
@@ -761,7 +761,7 @@ void write_commit_graph(const char *obj_dir,
uint32_t chunk_ids[5];
uint64_t chunk_offsets[5];
int num_chunks;
-   int num_extra_edges;
+   int num_large_edges;
struct commit_list *parent;
struct progress *progress = NULL;
 
@@ -871,7 +871,7 @@ void write_commit_graph(const char *obj_dir,
commits.alloc = count_distinct;
ALLOC_ARRAY(commits.list, commits.alloc);
 
-   num_extra_edges = 0;
+   num_large_edges = 0;
for (i = 0; i < oids.nr; i++) {
int num_parents = 0;
if (i > 0 && oideq([i - 1], [i]))
@@ -885,11 +885,11 @@ void write_commit_graph(const char *obj_dir,
num_parents++;
 
if (num_parents > 2)
-   num_extra_edges += num_parents - 1;
+   num_large_edges += num_parents - 1;
 
commits.nr++;
}
-   num_chunks = num_extra_edges ? 4 : 3;
+   num_chunks = num_large_edges ? 4 : 3;
 
if (commits.nr >= GRAPH_PARENT_MISSING)
die(_("too many commits to write graph"));
@@ -916,7 +916,7 @@ void write_commit_graph(const char *obj_dir,
chunk_ids[0] = GRAPH_CHUNKID_OIDFANOUT;
chunk_ids[1] = GRAPH_CHUNKID_OIDLOOKUP;
chunk_ids[2] = GRAPH_CHUNKID_DATA;
-   if (num_extra_edges)
+   if (num_large_edges)
chunk_ids[3] = GRAPH_CHUNKID_LARGEEDGES;
else
chunk_ids[3] = 0;
@@ -926,7 +926,7 @@ void write_commit_graph(const char *obj_dir,
chunk_offsets[1] = chunk_offsets[0] + GRAPH_FANOUT_SIZE;
chunk_offsets[2] = chunk_offsets[1] + GRAPH_OID_LEN * commits.nr;
chunk_offsets[3] = chunk_offsets[2] + (GRAPH_OID_LEN + 16) * commits.nr;
-   chunk_offsets[4] = chunk_offsets[3] + 4 * num_extra_edges;
+   chunk_offsets[4] = chunk_offsets[3] + 4 * num_large_edges;
 
for (i = 0; i <= num_chunks; i++) {
uint32_t chunk_write[3];
-- 
2.20.0.rc0.134.gf0022f8e60



[no subject]

2018-11-20 Thread SZEDER Gábor
On Tue, Nov 20, 2018 at 05:58:00PM +0100, SZEDER Gábor wrote:
> I saw a
> bit of weirdness while at it, and want to look into it, but now I've
> got to go...

So here are two simple patches that address the "Huh?!" moments I had
while looking at the progress output during writing the commit graph
file.  The first is a small cleanup to avoid confusion, but see the
notes attaches, while the second is a bit of an optimization.

SZEDER Gábor (2):
  commit-graph: rename 'num_extra_edges' variable to 'num_large_edges'
  commit-graph: don't call write_graph_chunk_large_edges() unnecessarily

 commit-graph.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

-- 
2.20.0.rc0.134.gf0022f8e60



Re: [PATCH v2 2/6] commit-graph write: add more progress output

2018-11-20 Thread SZEDER Gábor
On Tue, Nov 20, 2018 at 07:50:23PM +, Ævar Arnfjörð Bjarmason wrote:
> Add more progress output to the output already added in
> 7b0f229222 ("commit-graph write: add progress output", 2018-09-17).
> 
> As noted in that commit most of the progress output isn't displayed on
> small repositories, but before this change we'd noticeably hang for
> 2-3 seconds at the end on medium sized repositories such as linux.git.
> 
> Now we'll instead show output like this, and have no human-observable
> point at which we're not producing progress output:
> 
> $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
> Finding commits for commit graph: 6365492, done.
> Computing commit graph generation numbers: 100% (797222/797222), done.
> Writing out commit graph: 2399912, done.
> 
> This "writing out" number is not meant to be meaningful to the user,
> but just to show that we're doing work and the command isn't
> hanging.
> 
> In the current implementation it's approximately 4x the number of
> commits.

"approximately" only, because the current implementation is buggy :)
If done right it's exactly 4x the number of commits.

> As noted in on-list discussion[1] we could add the loops up
> and show percentage progress here, but I don't think it's worth it. It
> would make the implementation more complex and harder to maintain for
> very little gain.

I think that if we can cheaply and accurately figure out the total,
then we should display it, so onlooking users can guesstimate how much
work is still left to be done.

> On a much larger in-house repository I have we'll show (note how we
> also say "Annotating[...]"):
> 
> $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
> Finding commits for commit graph: 50026015, done.
> Annotating commit graph: 21567407, done.
> Computing commit graph generation numbers: 100% (7144680/7144680), done.
> Writing out commit graph: 21434417, done.
> 
> 1. https://public-inbox.org/git/20181120165800.gb30...@szeder.dev/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  commit-graph.c | 41 -
>  1 file changed, 32 insertions(+), 9 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index e6d0d7722b..6f6409b292 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -433,7 +433,9 @@ struct tree *get_commit_tree_in_graph(struct repository 
> *r, const struct commit
>  
>  static void write_graph_chunk_fanout(struct hashfile *f,
>struct commit **commits,
> -  int nr_commits)
> +  int nr_commits,
> +  struct progress *progress,
> +  uint64_t *progress_cnt)
>  {
>   int i, count = 0;
>   struct commit **list = commits;
> @@ -445,6 +447,7 @@ static void write_graph_chunk_fanout(struct hashfile *f,
>*/
>   for (i = 0; i < 256; i++) {
>   while (count < nr_commits) {
> + display_progress(progress, ++*progress_cnt);

I think this display_progress() should be places after the condition,
so no one has to waste brain cycles on figuring out, why it always
counts 255 more than the number of commits.

>   if ((*list)->object.oid.hash[0] != i)
>   break;
>   count++;
> @@ -456,12 +459,16 @@ static void write_graph_chunk_fanout(struct hashfile *f,
>  }
>  
>  static void write_graph_chunk_oids(struct hashfile *f, int hash_len,
> -struct commit **commits, int nr_commits)
> +struct commit **commits, int nr_commits,
> +struct progress *progress,
> +uint64_t *progress_cnt)
>  {
>   struct commit **list = commits;
>   int count;
> - for (count = 0; count < nr_commits; count++, list++)
> + for (count = 0; count < nr_commits; count++, list++) {
> + display_progress(progress, ++*progress_cnt);
>   hashwrite(f, (*list)->object.oid.hash, (int)hash_len);
> + }
>  }
>  
>  static const unsigned char *commit_to_sha1(size_t index, void *table)
> @@ -471,7 +478,9 @@ static const unsigned char *commit_to_sha1(size_t index, 
> void *table)
>  }
>  
>  static void write_graph_chunk_data(struct hashfile *f, int hash_len,
> -struct commit **commits, int nr_commits)
> +struct commit **commits, int nr_commits,
> +struct progress *progress,
> +uint64_t *progress_cnt)
>  {
>   struct commit **list = commits;
>   struct commit **last = commits + nr_commits;
> @@ -481,6 +490,7 @@ static void write_graph_chunk_data(struct hashfile *f, 
> int hash_len,
>   struct commit_list *parent;
>   int edge_value;
>   uint32_t 

Re: [PATCH 2/6] commit-graph write: add more progress output

2018-11-20 Thread SZEDER Gábor
On Tue, Nov 20, 2018 at 03:04:39PM +, Ævar Arnfjörð Bjarmason wrote:
> Add more progress output to the output already added in
> 7b0f229222 ("commit-graph write: add progress output", 2018-09-17).
> 
> As noted in that commit most of the progress output isn't displayed on
> small repositories, but before this change we'd noticeably hang for
> 2-3 seconds at the end on medium sized repositories such as linux.git.
> 
> Now we'll instead show output like this, and have no human-observable
> point at which we're not producing progress output:
> 
> $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
> Finding commits for commit graph: 6418991, done.
> Computing commit graph generation numbers: 100% (797205/797205), done.
> Writing out commit graph chunks: 2399861, done.
> 
> This "graph chunks" number is not meant to be meaningful to the user,
> but just to show that we're doing work and the command isn't
> hanging.
> 
> On a much larger in-house repository I have we'll show (note how we
> also say "Annotating[...]"):
> 
> $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
> Finding commits for commit graph: 48271163, done.
> Annotating commit graph: 21424536, done.
> Computing commit graph generation numbers: 100% (7141512/7141512), done.
> Writing out commit graph chunks: 21424913, done.

That's a lot of chunks, but according to the specs, there are only 3
or 4 chunks in a commit-graph file.  More on this below.

> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  commit-graph.c | 47 ++-
>  1 file changed, 38 insertions(+), 9 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index e6d0d7722b..afce20dd4d 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -433,7 +433,9 @@ struct tree *get_commit_tree_in_graph(struct repository 
> *r, const struct commit
>  
>  static void write_graph_chunk_fanout(struct hashfile *f,
>struct commit **commits,
> -  int nr_commits)
> +  int nr_commits,
> +  struct progress *progress,
> +  uint64_t *progress_cnt)
>  {
>   int i, count = 0;
>   struct commit **list = commits;
> @@ -445,6 +447,8 @@ static void write_graph_chunk_fanout(struct hashfile *f,
>*/
>   for (i = 0; i < 256; i++) {
>   while (count < nr_commits) {
> + if (progress)
> + display_progress(progress, ++*progress_cnt);

The condition is unnecessary, display_progress() is prepared to deal
with a NULL progress pointer.  The same applies to all such calls in
this patch.

>   if ((*list)->object.oid.hash[0] != i)
>   break;
>   count++;
> @@ -456,12 +460,17 @@ static void write_graph_chunk_fanout(struct hashfile *f,
>  }
>  
>  static void write_graph_chunk_oids(struct hashfile *f, int hash_len,
> -struct commit **commits, int nr_commits)
> +struct commit **commits, int nr_commits,
> +struct progress *progress,
> +uint64_t *progress_cnt)
>  {
>   struct commit **list = commits;
>   int count;
> - for (count = 0; count < nr_commits; count++, list++)
> + for (count = 0; count < nr_commits; count++, list++) {
> + if (progress)
> + display_progress(progress, ++*progress_cnt);
>   hashwrite(f, (*list)->object.oid.hash, (int)hash_len);
> + }
>  }
>  
>  static const unsigned char *commit_to_sha1(size_t index, void *table)
> @@ -471,7 +480,9 @@ static const unsigned char *commit_to_sha1(size_t index, 
> void *table)
>  }
>  
>  static void write_graph_chunk_data(struct hashfile *f, int hash_len,
> -struct commit **commits, int nr_commits)
> +struct commit **commits, int nr_commits,
> +struct progress *progress,
> +uint64_t *progress_cnt)
>  {
>   struct commit **list = commits;
>   struct commit **last = commits + nr_commits;
> @@ -482,6 +493,9 @@ static void write_graph_chunk_data(struct hashfile *f, 
> int hash_len,
>   int edge_value;
>   uint32_t packedDate[2];
>  
> + if (progress)
> + display_progress(progress, ++*progress_cnt);
> +
>   parse_commit(*list);
>   hashwrite(f, get_commit_tree_oid(*list)->hash, hash_len);
>  
> @@ -542,7 +556,9 @@ static void write_graph_chunk_data(struct hashfile *f, 
> int hash_len,
>  
>  static void write_graph_chunk_large_edges(struct hashfile *f,
> struct commit **commits,
> -   int 

Re: [PATCH 1/5] eoie: default to not writing EOIE section

2018-11-20 Thread SZEDER Gábor
On Tue, Nov 20, 2018 at 08:06:16AM -0500, Ben Peart wrote:
> >diff --git a/read-cache.c b/read-cache.c
> >index 4ca81286c0..1e9c772603 100644
> >--- a/read-cache.c
> >+++ b/read-cache.c
> >@@ -2689,6 +2689,15 @@ void update_index_if_able(struct index_state *istate, 
> >struct lock_file *lockfile
> > rollback_lock_file(lockfile);
> >  }
> >+static int record_eoie(void)
> >+{
> >+int val;
> 
> I believe you are going to want to initialize val to 0 here as it is on the
> stack so is not guaranteed to be zero.

The git_config_get_bool() call below will initialize it anyway.

> >+
> >+if (!git_config_get_bool("index.recordendofindexentries", ))
> >+return val;
> >+return 0;
> >+}


Re: How to prepare patch for git am which remove a file ?

2018-11-20 Thread SZEDER Gábor
On Tue, Nov 20, 2018 at 01:39:40PM +0100, Mathieu Malaterre wrote:
> Here is a simple setup:
> 
>   cd /tmp
>   mkdir g
>   cd g
>   git init .
>   wget 
> http://central.maven.org/maven2/org/apache/xmlgraphics/fop/2.1/fop-2.1.pom
>   git add fop-2.1.pom
>   git commit -m "My First Commit"
>   git rm fop-2.1.pom
>   git commit -m "Second Commit"
>   git format-patch HEAD~
>   git reset --hard HEAD~
>   git am 0001-Second-Commit.patch
> Applying: Second Commit
> error: patch failed: fop-2.1.pom:1
> error: fop-2.1.pom: patch does not apply
> Patch failed at 0001 Second Commit
> hint: Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> 
> What is the black magic to get `git am` to understand this patch ?

The file in question uses CRLF line endings.

  $ git am --keep-cr 0001-Second-Commit.patch
  Applying: Second Commit

For explanation I quote ad2c928001 (git-am: Add command line parameter
`--keep-cr` passing it to git-mailsplit, 2010-02-27):

  c2ca1d7 (Allow mailsplit (and hence git-am) to handle mails with CRLF
  line-endings, 2009-08-04) fixed "git mailsplit" to help people with
  MUA whose output from save-as command uses CRLF as line terminators by
  stripping CR at the end of lines.
  
  However, when you know you are feeding output from "git format-patch"
  directly to "git am", and especially when your contents have CR at the
  end of line, such stripping is undesirable.  To help such a use case,
  teach --keep-cr option to "git am" and pass that to "git mailinfo".



Re: [PATCH] tests: send "bug in the test script" errors to the script's stderr

2018-11-20 Thread SZEDER Gábor
On Tue, Nov 20, 2018 at 05:58:25AM -0500, Jeff King wrote:
> On Tue, Nov 20, 2018 at 05:45:28AM -0500, Jeff King wrote:
> 
> > On Tue, Nov 20, 2018 at 12:34:04AM +0100, SZEDER Gábor wrote:
> > 
> > > > I do notice that many of the existing "FATAL:" errors use descriptor 5,
> > > > which goes to stdout. I'm not sure if those should actually be going to
> > > > stderr (or if there's some TAP significance to those lines).
> > > 
> > > I chose to send these messages to standard error, because they are,
> > > well, errors.  TAP only cares about stdout, but by aborting the test
> > > script in BUG(), error() or die() we are already violating TAP anyway,
> > > so I don't think it matters whether we send "bug in test script" or
> > > "FATAL" errors to stdout or stderr.
> > 
> > Yeah, I agree it probably doesn't matter. I was mostly suggesting to
> > make the existing ">&5" into ">&7" for consistency. But I don't think
> > that needs to block your patch.
> 
> By the way, I did check to see whether this might help the situation
> where running under "prove" we see only "Dubious..." and not the actual
> error() message produced by the test script. But no, it eats both stdout
> and stderr. You can sneak a line through by prepending it with "#", but
> only if "prove -o" is used (and even then, it's hard to associate it
> with a particular test when you're running many in parallel).

Just to be clear: I don't mind if in some combination of test
harnesses and test options a "bug in the test script" message doesn't
reach the terminal as long as I get a clearly visible error from
somewhere.

> So I guess the status quo is not too bad: prove says "dubious", and then
> you re-run the test with "./t1234-foo.sh -v -i" and you get to see the
> error.

And with '--verbose-log' the "bug in the test script" message goes to
the test's log as well, even when it has to go through fd 7 first, so
if you use 'prove' and your GIT_TEST_OPTS includes '--verbose-log'
then you can just look at the log, there's no need to re-run the test.



Re: [PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure

2018-11-20 Thread SZEDER Gábor
On Mon, Nov 19, 2018 at 02:49:20PM -0500, Jeff King wrote:
> On Mon, Nov 19, 2018 at 02:28:18PM +0100, SZEDER Gábor wrote:
> 
> > The 'test_cmp_rev' helper is merely a wrapper around 'test_cmp'
> > checking the output of two 'git rev-parse' commands, which means that
> > its output on failure is not particularly informative, as it's
> > basically two OIDs with a bit of extra clutter of the diff header, but
> > without any indication of which two revisions have caused the failure:
> > 
> >   --- expect.rev  2018-11-17 14:02:11.569747033 +
> >   +++ actual.rev  2018-11-17 14:02:11.569747033 +
> >   @@ -1 +1 @@
> >   -d79ce1670bdcb76e6d1da2ae095e890ccb326ae9
> >   +139b20d8e6c5b496de61f033f642d0e3dbff528d
> > 
> > It also pollutes the test repo with these two intermediate files,
> > though that doesn't seem to cause any complications in our current
> > tests (meaning that I couldn't find any tests that have to work around
> > the presence of these files by explicitly removing or ignoring them).
> > 
> > Enhance 'test_cmp_rev' to provide a more useful output on failure with
> > less clutter:
> > 
> >   error: two revisions point to different objects:
> > 'HEAD^': d79ce1670bdcb76e6d1da2ae095e890ccb326ae9
> > 'extra': 139b20d8e6c5b496de61f033f642d0e3dbff528d
> > 
> > Doing so is more convenient when storing the OIDs outputted by 'git
> > rev-parse' in a local variable each, which, as a bonus, won't pollute
> > the repository with intermediate files.
> > 
> > While at it, also ensure that 'test_cmp_rev' is invoked with the right
> > number of parameters, namely two.
> 
> This is an improvement, in my opinion (and I agree that using your new
> BUG for this last part would be better still). It also saves a process
> in the common case.

But then it adds two subshells to the common case right away...  I'm
not sure which is worse on Windows, where it matters the most.

> One question:
> 
> > +   else
> > +   local r1 r2
> > +   r1=$(git rev-parse --verify "$1") &&
> > +   r2=$(git rev-parse --verify "$2") &&
> > +   if test "$r1" != "$r2"
> > +   then
> > +   cat >&4 <<-EOF
> > +   error: two revisions point to different objects:
> > + '$1': $r1
> > + '$2': $r2
> > +   EOF
> > +   return 1
> > +   fi
> 
> Why does this cat go to descriptor 4? I get why you'd want it to (it's
> meant for the user's eyes, and that's where 4 goes),

Exactly.

> but we do not
> usually bother to do so for our helper functions (like test_cmp).

test_i18ngrep() does since your 03aa3783f2 (t: send verbose
test-helper output to fd 4, 2018-02-22).

> I don't think it matters usually in practice, because nobody tries to
> capture the stderr of test_cmp, etc.

See 54ce2e9be9 (t9400-git-cvsserver-server: don't rely on the output
of 'test_cmp', 2018-03-08) for a fun example, I remember it caused a
spike in WTF/minutes :)


> I don't think it would ever hurt,
> though. Should we be doing that for all the others, too?

I think we should, and you agreed:

https://public-inbox.org/git/20180303065648.ga17...@sigill.intra.peff.net/

but then went travelling, and then the whole thing got forgotten.



Re: [PATCH] tests: send "bug in the test script" errors to the script's stderr

2018-11-19 Thread SZEDER Gábor
On Mon, Nov 19, 2018 at 02:44:32PM -0500, Jeff King wrote:
> On Mon, Nov 19, 2018 at 02:13:26PM +0100, SZEDER Gábor wrote:
> 
> > Send these "bug in the test script" error messages directly to the
> > test scripts standard error and thus to the terminal, so those bugs
> > will be much harder to overlook.  Instead of updating all ~20 such
> > 'error' calls with a redirection, let's add a BUG() function to
> > 'test-lib.sh', wrapping an 'error' call with the proper redirection
> > and also including the common prefix of those error messages, and
> > convert all those call sites [4] to use this new BUG() function
> > instead.
> 
> Yes, I think this is an improvement.
> 
> > +BUG () {
> > +   error >&7 "bug in the test script: $*"
> > +}
> 
> I naively expected this to go to >&4, but of course that is the
> conditional "stderr or /dev/null, depending on --verbose" descriptor. 

The first version of this patch did send the error message to fd 4 ;)

> I do notice that many of the existing "FATAL:" errors use descriptor 5,
> which goes to stdout. I'm not sure if those should actually be going to
> stderr (or if there's some TAP significance to those lines).

I chose to send these messages to standard error, because they are,
well, errors.  TAP only cares about stdout, but by aborting the test
script in BUG(), error() or die() we are already violating TAP anyway,
so I don't think it matters whether we send "bug in test script" or
"FATAL" errors to stdout or stderr.

BTW, TAP understands "Bail out!" as bail out from the _entire_ test
suite, but that's not what we want here, I'd think.

https://testanything.org/tap-specification.html#bail-out



Re: [PATCH] commit-graph: split up close_reachable() progress output

2018-11-19 Thread SZEDER Gábor
On Mon, Nov 19, 2018 at 08:23:00PM +, Ævar Arnfjörð Bjarmason wrote:
> Amend the progress output added in 7b0f229222 ("commit-graph write:
> add progress output", 2018-09-17) so that the total numbers it reports
> aren't higher than the total number of commits anymore. See [1] for a
> bug report pointing that out.

Please make the commit message more self-contained, i.e. describe the
issue this patch fixes in more detail, so readers won't have to follow
links to understand the problem.

> When I added this I wasn't intending to provide an accurate count, but
> just have some progress output to show the user the command wasn't
> hanging[2]. But since we are showing numbers, let's make them
> accurate. The progress descriptions were suggested by Derrick Stolee
> in [3].
> 
> As noted in [2] we are unlikely to show anything except the "Expanding
> reachable..." message even on fairly large repositories such as
> linux.git. On a test repository I have with north of 7 million commits
> all of these are displayed. Two of them don't show up for long, but as
> noted in [5] future-proofing this for if the loops become more
> expensive in the future makes sense.

In my opinion this is rather one of those "we'll cross that bridge
when (or if ever) we get there" situations.

> 1. https://public-inbox.org/git/20181010203738.ge23...@szeder.dev/
> 2. https://public-inbox.org/git/87pnwhea8y@evledraar.gmail.com/
> 3. 
> https://public-inbox.org/git/f7a0cbee-863c-61d3-4959-5cec8b43c...@gmail.com/
> 4. https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/
> 5. https://public-inbox.org/git/87murle8da@evledraar.gmail.com/
> 
> Reported-by: SZEDER Gábor 
> Helped-by: Derrick Stolee 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
> 
> On Mon, Nov 19 2018, SZEDER Gábor wrote:
> 
> > Ping?
> >
> > We are at -rc0, this progress output is a new feature since v2.19.0,
> > and the numbers shown are still way off.
> 
> I was under the impression after your
> https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/ that
> you were going to do some more digging & report back, so I put it on
> my "waiting for feedback" list and then forgot about it.

No, after I managed to "get my numbers straight" I sent another bug
report in

  https://public-inbox.org/git/20181015165447.gh19...@szeder.dev/

but as a reply to your original patch.  Sorry about the confusion.

> But here's a patch that should address the issue you pointed out, but
> I don't know if it fixes whatever you were alluding to in the linked
> E-Mail above.

I'm afraid this patch doesn't address that issue, as it's limited to
close_reachable(), and that issue is related to the progress output in
add_packed_commits().

>  commit-graph.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 40c855f185..9c0d6914be 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -641,26 +641,29 @@ static void add_missing_parents(struct packed_oid_list 
> *oids, struct commit *com
>  
>  static void close_reachable(struct packed_oid_list *oids, int 
> report_progress)
>  {
> - int i;
> + int i, j;
>   struct commit *commit;
>   struct progress *progress = NULL;
> - int j = 0;
>  
>   if (report_progress)
>   progress = start_delayed_progress(
> - _("Annotating commits in commit graph"), 0);
> + _("Loading known commits in commit graph"), j = 0);
>   for (i = 0; i < oids->nr; i++) {
>   display_progress(progress, ++j);
>   commit = lookup_commit(the_repository, >list[i]);
>   if (commit)
>   commit->object.flags |= UNINTERESTING;
>   }
> + stop_progress();
>  
>   /*
>* As this loop runs, oids->nr may grow, but not more
>* than the number of missing commits in the reachable
>* closure.
>*/
> + if (report_progress)
> + progress = start_delayed_progress(
> + _("Expanding reachable commits in commit graph"), j = 
> 0);
>   for (i = 0; i < oids->nr; i++) {
>   display_progress(progress, ++j);
>   commit = lookup_commit(the_repository, >list[i]);
> @@ -668,7 +671,11 @@ static void close_reachable(struct packed_oid_list 
> *oids, int report_progress)
>   if (commit && !parse_commit(commit))
>   add_missing_parents(oids, commit);
>   }
> + stop_progress();
>  
> + if (report_progress)
> + progress = start_delayed_progress(
> + _("Clearing commit marks in commit graph"), j = 0);
>   for (i = 0; i < oids->nr; i++) {
>   display_progress(progress, ++j);
>   commit = lookup_commit(the_repository, >list[i]);
> -- 
> 2.19.1.1182.g4ecb1133ce
> 


Re: [PATCH v3 1/2] commit-graph write: add progress output

2018-11-19 Thread SZEDER Gábor
Ping?

We are at -rc0, this progress output is a new feature since v2.19.0,
and the numbers shown are still way off.


On Mon, Oct 15, 2018 at 06:54:47PM +0200, SZEDER Gábor wrote:
> On Mon, Sep 17, 2018 at 03:33:35PM +, Ævar Arnfjörð Bjarmason wrote:
> 
> > @@ -560,6 +563,9 @@ static int add_packed_commits(const struct object_id 
> > *oid,
> > off_t offset = nth_packed_object_offset(pack, pos);
> > struct object_info oi = OBJECT_INFO_INIT;
> >  
> > +   if (list->progress)
> > +   display_progress(list->progress, ++list->progress_done);
> 
> Note that add_packed_commits() is used as a callback function for
> for_each_object_in_pack() (with '--stdin-packs') or
> for_each_packed_object() (no options), i.e. this will count the number
> of objects, not commits:
> 
>   $ git rev-list --all |wc -l
>   768524
>   $ git rev-list --objects --all |wc -l
>   6130295
>   # '--count --objects' together didn't work as expected.
>   $ time ~/src/git/git commit-graph write
>   Finding commits for commit graph: 6130295, done.
>   Annotating commits in commit graph: 2305572, done.
>   Computing commit graph generation numbers: 100% (768524/768524), done.
> 
> (Now I also see the 3x difference in the "Annotating commits" counter
> that you mentioned.)
> 
> I see two options:
> 
>   - Provide a different title for this progress counter, e.g.
> "Scanning objects for c-g", or "Processing objects...", or
> something else that says "objects" instead of "commits".
> 
>   - Move this condition and display_progress() call to the end of the
> function, so it will only count commits, not any other objects.
> (As far as I understand both for_each_object_in_pack() and
> for_each_packed_object() iterate in pack .idx order, i.e. it's
> essentially random.  This means that commit objects should be
> distributed evenly among other kinds of objects, so we don't have
> to worry about the counter stalling for a long stretch of
> consecutive non-commit objects.  At least in theory.)
> 
> 
> 


Re: [PATCH] technical doc: add a design doc for the evolve command

2018-11-19 Thread SZEDER Gábor
On Sat, Nov 17, 2018 at 12:30:58PM -0800, Stefan Xenos wrote:
> > Further, I see that this document tries to suggest a proliferation of new 
> > commands
> 
> It does. Let me explain a bit about the reasoning behind this
> breakdown of commands. My main priority was to keep the commands as
> consistent with existing git commands as possible. Secondary goals
> were:
> - Mapping a single intent to a single command where possible makes it
> easier to explain what that command does.
> - Having lots of simpler commands as opposed to a few complex commands
> makes them easier to type.
> - Command names are more descriptive than lettered arguments.

Subcommand names and --long-options are just as descriptive.


> Git already has a "log" and "reflog" command for displaying two
> different types of log,

No, there is 'git log' for displaying logs in various ways, and there
is 'git reflog' which not only displays reflogs, but also operates on
them, e.g. deletes specific reflog entries or expires old entries,
necessitating and justifying the dedicated 'git reflog' command.

> so putting "obslog" on its own command makes
> it consistent with the existing logs, easier to type, and keeps the
> command simple.

> - We could turn "obslog" into an extra option on the "log" command,
> but that would be inconsistent with reflog and would complicate the
> already-complex log command.

On one hand, it's unclear to me what additional operations the
proposed 'git obslog' command will perform besides showing the log of
a change.  If there are no such operations, then it can't really be
compared to 'git reflog' to justify a dedicated 'git obslog' command.

OTOH, note that 'git log' already has a '--walk-reflogs' option, and
indeed 'git reflog [show]' is implemented via the common log
machinery.  And this is not a mere implementation detail, because "git
reflog show accepts any of the options accepted by git log" (quoting
its manpage), making it possible to filter, limit and format reflog
entries, e.g.:

  git reflog --format='%h %cd %s' --author=szeder -5 branch file

I think 'git obslog' should allow the same when showing the log of a
change.


> Personally, I don't
> consider a proliferation of new commands to be inherently bad (or
> inherently good, really). Is there a reason new commands should be
> avoided?

If a user wants to deal with reflogs, then there is 'git help reflog'
which in one manpage describes the concept, and how to list and
expire reflogs and delete individual entries from a reflog using the
various subcommands.  If a user wants to deal with stashes, then there
is 'git help stash', which in one manpage describes the concept, and
how to create, list, show, apply, delete, etc. stashes using the
various subcommands.  See where this is going?  The same applies to
bisect, notes, remotes, rerere, submodules, worktree; maybe there are
more.  This is a Good Thing.

By adding several new commands users will have to consult the manpages
of 'evolve', 'change', 'obslog', etc., even though the commands and
the concepts are closely related.




Re: [PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure

2018-11-19 Thread SZEDER Gábor
On Mon, Nov 19, 2018 at 02:28:18PM +0100, SZEDER Gábor wrote:
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index d158c8d0bf..fc84db67a1 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -854,9 +854,23 @@ test_must_be_empty () {
>  
>  # Tests that its two parameters refer to the same revision
>  test_cmp_rev () {
> - git rev-parse --verify "$1" >expect.rev &&
> - git rev-parse --verify "$2" >actual.rev &&
> - test_cmp expect.rev actual.rev
> + if test $# != 2
> + then
> + error "bug in the test script: test_cmp_rev requires two 
> revisions, but got $#"

And this here is another "bug in the test script" error that should be
turned into:

  BUG "test_cmp_rev requires two revisions, but got $#"

See: https://public-inbox.org/git/20181119131326.2435-1-szeder@gmail.com/



[PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure

2018-11-19 Thread SZEDER Gábor
The 'test_cmp_rev' helper is merely a wrapper around 'test_cmp'
checking the output of two 'git rev-parse' commands, which means that
its output on failure is not particularly informative, as it's
basically two OIDs with a bit of extra clutter of the diff header, but
without any indication of which two revisions have caused the failure:

  --- expect.rev  2018-11-17 14:02:11.569747033 +
  +++ actual.rev  2018-11-17 14:02:11.569747033 +
  @@ -1 +1 @@
  -d79ce1670bdcb76e6d1da2ae095e890ccb326ae9
  +139b20d8e6c5b496de61f033f642d0e3dbff528d

It also pollutes the test repo with these two intermediate files,
though that doesn't seem to cause any complications in our current
tests (meaning that I couldn't find any tests that have to work around
the presence of these files by explicitly removing or ignoring them).

Enhance 'test_cmp_rev' to provide a more useful output on failure with
less clutter:

  error: two revisions point to different objects:
'HEAD^': d79ce1670bdcb76e6d1da2ae095e890ccb326ae9
'extra': 139b20d8e6c5b496de61f033f642d0e3dbff528d

Doing so is more convenient when storing the OIDs outputted by 'git
rev-parse' in a local variable each, which, as a bonus, won't pollute
the repository with intermediate files.

While at it, also ensure that 'test_cmp_rev' is invoked with the right
number of parameters, namely two.

Signed-off-by: SZEDER Gábor 
---
 t/test-lib-functions.sh | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index d158c8d0bf..fc84db67a1 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -854,9 +854,23 @@ test_must_be_empty () {
 
 # Tests that its two parameters refer to the same revision
 test_cmp_rev () {
-   git rev-parse --verify "$1" >expect.rev &&
-   git rev-parse --verify "$2" >actual.rev &&
-   test_cmp expect.rev actual.rev
+   if test $# != 2
+   then
+   error "bug in the test script: test_cmp_rev requires two 
revisions, but got $#"
+   else
+   local r1 r2
+   r1=$(git rev-parse --verify "$1") &&
+   r2=$(git rev-parse --verify "$2") &&
+   if test "$r1" != "$r2"
+   then
+   cat >&4 <<-EOF
+   error: two revisions point to different objects:
+ '$1': $r1
+ '$2': $r2
+   EOF
+   return 1
+   fi
+   fi
 }
 
 # Print a sequence of integers in increasing order, either with
-- 
2.20.0.rc0.134.gf0022f8e60



[PATCH] tests: send "bug in the test script" errors to the script's stderr

2018-11-19 Thread SZEDER Gábor
Some of the functions in our test library check that they were invoked
properly with conditions like this:

  test "$#" = 2 ||
  error "bug in the test script: not 2 parameters to test-expect-success"

If this particular condition is triggered, then 'error' will abort the
whole test script with a bold red error message [1] right away.

However, under certain circumstances the test script will be aborted
completely silently, namely if:

  - a similar condition in a test helper function like
'test_line_count' is triggered,
  - which is invoked from the test script's "main" shell [2],
  - and the test script is run manually (i.e. './t1234-foo.sh' as
opposed to 'make t1234-foo.sh' or 'make test') [3]
  - and without the '--verbose' option,

because the error message is printed from within 'test_eval_', where
standard output is redirected either to /dev/null or to a log file.
The only indication that something is wrong is that not all tests in
the script are executed and at the end of the test script's output
there is no "# passed all N tests" message, which are subtle and can
easily go unnoticed, as I had to experience myself.

Send these "bug in the test script" error messages directly to the
test scripts standard error and thus to the terminal, so those bugs
will be much harder to overlook.  Instead of updating all ~20 such
'error' calls with a redirection, let's add a BUG() function to
'test-lib.sh', wrapping an 'error' call with the proper redirection
and also including the common prefix of those error messages, and
convert all those call sites [4] to use this new BUG() function
instead.

[1] That particular error message from 'test_expect_success' is
printed in color only when running with or without '--verbose';
with '--tee' or '--verbose-log' the error is printed without
color, but it is printed to the terminal nonetheless.

[2] If such a condition is triggered in a subshell of a test, then
'error' won't be able to abort the whole test script, but only the
subshell, which in turn causes the test to fail in the usual way,
indicating loudly and clearly that something is wrong.

[3] Well, 'error' aborts the test script the same way when run
manually or by 'make' or 'prove', but both 'make' and 'prove' pay
attention to the test script's exit status, and even a silently
aborted test script would then trigger those tools' usual
noticable error messages.

[4] Strictly speaking, not all those 'error' calls need that
redirection to send their output to the terminal, see e.g.
'test_expect_success' in the opening example, but I think it's
    better to be consistent.

Signed-off-by: SZEDER Gábor 
---
 t/perf/perf-lib.sh  |  4 ++--
 t/t0001-init.sh |  4 ++--
 t/t4013-diff-various.sh |  2 +-
 t/t5516-fetch-push.sh   |  2 +-
 t/t9902-completion.sh   |  2 +-
 t/test-lib-functions.sh | 25 -
 t/test-lib.sh   | 10 +++---
 7 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 11d1922cf5..2e33ab3ec3 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -82,7 +82,7 @@ test_perf_do_repo_symlink_config_ () {
 
 test_perf_create_repo_from () {
test "$#" = 2 ||
-   error "bug in the test script: not 2 parameters to test-create-repo"
+   BUG "not 2 parameters to test-create-repo"
repo="$1"
source="$2"
source_git="$("$MODERN_GIT" -C "$source" rev-parse --git-dir)"
@@ -184,7 +184,7 @@ test_wrapper_ () {
test_start_
test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
test "$#" = 2 ||
-   error "bug in the test script: not 2 or 3 parameters to 
test-expect-success"
+   BUG "not 2 or 3 parameters to test-expect-success"
export test_prereq
if ! test_skip "$@"
then
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 182da069f1..42a263cada 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -319,14 +319,14 @@ test_lazy_prereq GETCWD_IGNORES_PERMS '
base=GETCWD_TEST_BASE_DIR &&
mkdir -p $base/dir &&
chmod 100 $base ||
-   error "bug in test script: cannot prepare $base"
+   BUG "cannot prepare $base"
 
(cd $base/dir && /bin/pwd -P)
status=$?
 
chmod 700 $base &&
rm -rf $base ||
-   error "bug in test script: cannot clean $base"
+   BUG "cannot clean $base"
return $status
 '
 
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 73f7038253..7d985ff6b1 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -129,7 +129,7 @@ do
case "$magic" in
noellipses) ;;
 

Re: [PATCH v3 1/1] merge: add scissors line on merge conflict

2018-11-18 Thread SZEDER Gábor
On Sat, Nov 17, 2018 at 06:32:33PM -0500, Denton Liu wrote:
> diff --git a/t/t7600-merge.sh b/t/t7600-merge.sh
> index 106148254d..0d3db34f08 100755
> --- a/t/t7600-merge.sh
> +++ b/t/t7600-merge.sh
> @@ -247,6 +247,54 @@ test_expect_success 'merge --squash c3 with c7' '
>   test_cmp expect actual
>  '
>  
> +test_expect_success 'merge c3 with c7 with commit.cleanup = scissors' '
> + git config commit.cleanup scissors &&
> + git reset --hard c3 &&
> + test_must_fail git merge c7 &&
> + cat result.9z >file &&
> + git commit --no-edit -a &&
> +
> + {
> + cat <<-EOF
> + Merge tag '"'"'c7'"'"'
> +
> + #  >8 
> + # Do not modify or remove the line above.
> + # Everything below it will be ignored.

Note that these two lines of advice text are translated; see the
consequences below.

> + #
> + # Conflicts:
> + #   file
> + EOF
> + } >expect &&

The {...} block is unnecessary, because there is only a single command
in there.

> + git cat-file commit HEAD | sed -e '1,/^$/d' >actual &&

Please don't run git commands upstream of a pipe, because the pipe
hides their exit code.  Furthermore, put the sed script inside double
quotes, because the whole test is already in a single-quoted block.

I presume you wrote the test this way because you simply followed suit
of the previous test 'merge --squash c3 with c7', which did all the
same.  Bonus points for a preparatory patch that cleans up the
previous test ;)

> + test_cmp expect actual

But most importantly, here 'test_cmp' compares translated advice text
as well, which fails in the GETTEXT_POISON build.  Use 'test_i18ncmp'
instead.

> +'
> +
> +test_expect_success 'merge c3 with c7 with --squash commit.cleanup = 
> scissors' '
> + git config commit.cleanup scissors &&
> + git reset --hard c3 &&
> + test_must_fail git merge --squash c7 &&
> + cat result.9z >file &&
> + git commit --no-edit -a &&
> +
> + {
> + cat <<-EOF
> + Squashed commit of the following:
> +
> + $(git show -s c7)
> +
> + #  >8 
> + # Do not modify or remove the line above.
> + # Everything below it will be ignored.
> + #
> + # Conflicts:
> + #   file
> + EOF
> + } >expect &&
> + git cat-file commit HEAD | sed -e '1,/^$/d' >actual &&
> + test_cmp expect actual

Likewise.

> +'
> +
>  test_debug 'git log --graph --decorate --oneline --all'
>  
>  test_expect_success 'merge c1 with c2 and c3' '
> -- 
> 2.19.1
> 


Re: [RFC/PATCH] read-cache: write all indexes with the same permissions

2018-11-17 Thread SZEDER Gábor
On Sat, Nov 17, 2018 at 07:52:30AM +0100, Christian Couder wrote:
> On Fri, Nov 16, 2018 at 8:20 PM Duy Nguyen  wrote:
> >
> > On Fri, Nov 16, 2018 at 8:07 PM SZEDER Gábor  wrote:
> >
> > > With the default 20% threshold a new shared index is written rather
> > > frequently with our usual small test-repos:
> >
> > Side note. Split index is definitely not meant for small repos.
> 
> I very much agree with that. It makes sense to use them only for big
> repos and big repos usually don't pass a 20% threshold very often.

But our test suite does use very small repositories, so perhaps we
have been already testing what Ævar wanted to test?  (Though I didn't
quite understood what that was; and we likely don't do so explicitly,
but only by chance with GIT_TEST_SPLIT_INDEX=1.)

> > But
> > maybe we should have a lower limit (in terms of absolute number of
> > entries) that prevent splitting. This splitting seems excessive.
> 
> I would agree if split index was the default mode or if our goal was
> to eventually make it the default mode.

Same here.  If you don't need split index, i.e. don't have huge repos,
then don't enable it in the first place.  And if it is enabled in a
small repo, then the extra effort to write a new shared index is
negligible, and the space wasted for those small files doesn't really
matter (though arguably the output from a 'ls .git' would be
surprising...  which, at the same time, would be a good motivating
factor to turn the feature off).



Re: [PATCH v2] read-cache: write all indexes with the same permissions

2018-11-17 Thread SZEDER Gábor
On Fri, Nov 16, 2018 at 08:25:43PM +0100, Christian Couder wrote:
> On Fri, Nov 16, 2018 at 7:29 PM SZEDER Gábor  wrote:
> >
> > On Fri, Nov 16, 2018 at 06:31:05PM +0100, Christian Couder wrote:
> > > diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
> > > index 2ac47aa0e4..fa1d3d468b 100755
> > > --- a/t/t1700-split-index.sh
> > > +++ b/t/t1700-split-index.sh
> > > @@ -381,6 +381,26 @@ test_expect_success 'check 
> > > splitIndex.sharedIndexExpire set to "never" and "now"
> > >   test $(ls .git/sharedindex.* | wc -l) -le 2
> > >  '
> > >
> > > +test_expect_success POSIXPERM 'same mode for index & split index' '
> > > + git init same-mode &&
> > > + (
> > > + cd same-mode &&
> > > + test_commit A &&
> > > + test_modebits .git/index >index_mode &&
> > > + test_must_fail git config core.sharedRepository &&
> > > + git -c core.splitIndex=true status &&
> > > + shared=$(ls .git/sharedindex.*) &&
> >
> > I think the command substitution and 'ls' are unnecessary, and
> >
> >   shared=.git/sharedindex.*
> >
> > would work as well.
> 
> If there is no shared index file with the above we would get:
> 
> shared=.git/sharedindex.*
> $ echo $shared
> .git/sharedindex.*
> 
> which seems bug prone to me.

That's just a non-existing file, for which 'test_modebits' will print
nothing, which, in turn, will not match the modebits of '.git/index'.
And the "there are more than one shared index files" case is handled
by the case statement that was snipped from the email context.



Re: [RFC/PATCH] read-cache: write all indexes with the same permissions

2018-11-16 Thread SZEDER Gábor
On Fri, Nov 16, 2018 at 06:41:43PM +0100, Christian Couder wrote:
> On Tue, Nov 13, 2018 at 6:34 PM Ævar Arnfjörð Bjarmason
>  wrote:

> > I'm asking whether the bug in this patch isn't revealing an existing
> > issue with us not having any tests for N number of sharedindex.*
> > files. I.e. we have >1 of them, merge them and prune them, don't we?

We don't merge shared index files, but write a new one.

> I think we shouldn't have many of them. Usually we should have just
> one, though it's possible that switching the shared index files
> feature on and off several times or using temporary index files could
> create more than one.
>
> And there is clean_shared_index_files() which calls
> should_delete_shared_index() to make sure they are regularly cleaned
> up.

I would think that it's more common to have more than one shared index
files, because a new shared index is written when the number of
entries in the split index reaches the threshold given in
'splitIndex.maxPercentChange'.  The old ones are kept until they
expire, and 'splitIndex.sharedIndexExpire' defaults to 2 weeks (and
can even be be set to "never").

With the default 20% threshold a new shared index is written rather
frequently with our usual small test-repos:

  $ git init
  $ git update-index --split-index
  $ ls -1 .git/*index*
  .git/index
  .git/sharedindex.4370042739b31cd17a5c5cd6043a77c9a00df113
  $ echo 1 >file
  $ git add file
  $ git commit -q -m 1
  $ echo 2 >file
  $ git commit -q -m 2 file
  $ echo 3 >file
  $ git commit -q -m 3 file
  $ ls -1 .git/*index*
  .git/index
  .git/sharedindex.4370042739b31cd17a5c5cd6043a77c9a00df113
  .git/sharedindex.6aedbf71b1a6bdc0018078ec7571e1b21ba4b954
  .git/sharedindex.b9106e9b82a818a0e4e9148224fc44ea98f488a1
  .git/sharedindex.bad0b75d270a431b9e961cfc15df6ec935a67be8

> Anyway it's a different topic and according to what we privately
> discussed I just sent
> https://public-inbox.org/git/20181116173105.21784-1-chrisc...@tuxfamily.org/
> to fix the current issue.


Re: [PATCH v2] read-cache: write all indexes with the same permissions

2018-11-16 Thread SZEDER Gábor
On Fri, Nov 16, 2018 at 06:31:05PM +0100, Christian Couder wrote:
> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
> index 2ac47aa0e4..fa1d3d468b 100755
> --- a/t/t1700-split-index.sh
> +++ b/t/t1700-split-index.sh
> @@ -381,6 +381,26 @@ test_expect_success 'check splitIndex.sharedIndexExpire 
> set to "never" and "now"
>   test $(ls .git/sharedindex.* | wc -l) -le 2
>  '
>  
> +test_expect_success POSIXPERM 'same mode for index & split index' '
> + git init same-mode &&
> + (
> + cd same-mode &&
> + test_commit A &&
> + test_modebits .git/index >index_mode &&
> + test_must_fail git config core.sharedRepository &&
> + git -c core.splitIndex=true status &&
> + shared=$(ls .git/sharedindex.*) &&

I think the command substitution and 'ls' are unnecessary, and

  shared=.git/sharedindex.*

would work as well.

> + case "$shared" in
> + *" "*)
> + # we have more than one???
> + false ;;
> + *)
> + test_modebits "$shared" >split_index_mode &&
> + test_cmp index_mode split_index_mode ;;
> + esac
> + )
> +'
> +
>  while read -r mode modebits
>  do
>   test_expect_success POSIXPERM "split index respects 
> core.sharedrepository $mode" '
> -- 
> 2.19.1.1053.g063ed687ac
> 


Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-16 Thread SZEDER Gábor
On Fri, Nov 16, 2018 at 02:09:07PM +0900, Junio C Hamano wrote:
> Jeff King  writes:
> 
> > On Thu, Nov 15, 2018 at 04:38:44AM -0500, Jeff King wrote:
> >
> >> Is SOURCE_NONE a complete match for what we want?
> >> 
> >> I see problems in both directions:
> >> 
> >>  - sorting by "objectname" works now, but it's marked with SOURCE_OBJ,
> >>and would be forbidden with your patch.  I'm actually not sure if
> >>SOURCE_OBJ is accurate; we shouldn't need to access the object to
> >>show it (and we are probably wasting effort loading the full contents
> >>for tools like for-each-ref).
> >> 
> >>However, that's not the full story. For objectname:short, it _does_ call
> >>find_unique_abbrev(). So we expect to have an object directory.
> >
> > Oops, I'm apparently bad at reading. It is in fact SOURCE_OTHER, which
> > makes sense (outside of this whole "--sort outside a repo thing").
> >
> > But we'd ideally distinguish between "objectname" (which should be OK
> > outside a repo) and "objectname:short" (which currently segfaults).
> 
> Arguably, use of ref-filter machinery in ls-remote, whether it is
> given from inside or outside a repo, was a mistake in 1fb20dfd
> ("ls-remote: create '--sort' option", 2018-04-09), as the whole
> point of "ls-remote" is to peek the list of refs and it is perfectly
> normal that the objects listed are not available.

I hope that one day 'git ls-remote' will learn to '--format=...' its
output, and I think that (re)using the ref-filter machinery would be
the right way to go to achive that.  Sure, ref-filter supports a lot
of format specifiers that don't at all make sense in the context of
'ls-remote' (perhaps we should have a dedicated set of valid_atoms for
that), but I think it's perfectly reasonable to do something like:

  git ls-remote --format=%(refname:strip=2) remote

A concrete use case for that could be to eliminate the last remaining
shell loops from refs completion.



Re: [PATCH v3 11/11] fast-export: add a --show-original-ids option to show original names

2018-11-16 Thread SZEDER Gábor
On Thu, Nov 15, 2018 at 11:59:56PM -0800, Elijah Newren wrote:

> diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
> index d7d73061d0..5690fe2810 100755
> --- a/t/t9350-fast-export.sh
> +++ b/t/t9350-fast-export.sh
> @@ -77,6 +77,23 @@ test_expect_success 'fast-export 
> --reference-excluded-parents master~2..master'
>test $MASTER = $(git rev-parse --verify refs/heads/rewrite))
>  '
>  
> +test_expect_success 'fast-export --show-original-ids' '
> +
> + git fast-export --show-original-ids master >output &&
> + grep ^original-oid output| sed -e s/^original-oid.// | sort >actual &&

Nit: 'sed' can do what this 'grep' does:

  sed -n -e s/^original-oid.//p output | sort >actual &&

thus sparing a process.

> + git rev-list --objects master muss >objects-and-names &&
> + awk "{print \$1}" objects-and-names | sort >commits-trees-blobs &&
> + comm -23 actual commits-trees-blobs >unfound &&
> + test_must_be_empty unfound
> +'
> +
> +test_expect_success 'fast-export --show-original-ids | git fast-import' '
> +
> + git fast-export --show-original-ids master muss | git fast-import 
> --quiet &&
> + test $MASTER = $(git rev-parse --verify refs/heads/master) &&
> + test $MUSS = $(git rev-parse --verify refs/tags/muss)
> +'
> +
>  test_expect_success 'iso-8859-1' '
>  
>   git config i18n.commitencoding ISO8859-1 &&
> -- 
> 2.19.1.1063.g1796373474.dirty
> 


Re: [PATCH v2 08/11] fast-export: add --reference-excluded-parents option

2018-11-14 Thread SZEDER Gábor
On Tue, Nov 13, 2018 at 04:25:57PM -0800, Elijah Newren wrote:
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index 2fef00436b..3cc98c31ad 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -37,6 +37,7 @@ static int fake_missing_tagger;
>  static int use_done_feature;
>  static int no_data;
>  static int full_tree;
> +static int reference_excluded_commits;
>  static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
>  static struct string_list tag_refs = STRING_LIST_INIT_NODUP;
>  static struct refspec refspecs = REFSPEC_INIT_FETCH;
> @@ -596,7 +597,8 @@ static void handle_commit(struct commit *commit, struct 
> rev_info *rev,
>   message += 2;
>  
>   if (commit->parents &&
> - get_object_mark(>parents->item->object) != 0 &&
> + (get_object_mark(>parents->item->object) != 0 ||
> +  reference_excluded_commits) &&
>   !full_tree) {
>   parse_commit_or_die(commit->parents->item);
>   diff_tree_oid(get_commit_tree_oid(commit->parents->item),
> @@ -644,13 +646,21 @@ static void handle_commit(struct commit *commit, struct 
> rev_info *rev,
>   unuse_commit_buffer(commit, commit_buffer);
>  
>   for (i = 0, p = commit->parents; p; p = p->next) {
> - int mark = get_object_mark(>item->object);
> - if (!mark)
> + struct object *obj = >item->object;
> + int mark = get_object_mark(obj);
> +
> + if (!mark && !reference_excluded_commits)
>   continue;
>   if (i == 0)
> - printf("from :%d\n", mark);
> + printf("from ");
> + else
> + printf("merge ");
> + if (mark)
> + printf(":%d\n", mark);
>   else
> - printf("merge :%d\n", mark);
> + printf("%s\n", sha1_to_hex(anonymize ?
> +anonymize_sha1(>oid) :
> +obj->oid.hash));

Since we intend to move away from SHA-1, would this be a good time to
add an anonymize_oid() function, "while at it"?

>   i++;
>   }
>  
> @@ -931,13 +941,22 @@ static void handle_tags_and_duplicates(struct 
> string_list *extras)
>   /*
>* Getting here means we have a commit which
>* was excluded by a negative refspec (e.g.
> -  * fast-export ^master master).  If the user
> +  * fast-export ^master master).  If we are
> +  * referencing excluded commits, set the ref
> +  * to the exact commit.  Otherwise, the user
>* wants the branch exported but every commit
> -  * in its history to be deleted, that sounds
> -  * like a ref deletion to me.
> +  * in its history to be deleted, which basically
> +  * just means deletion of the ref.
>*/
> - printf("reset %s\nfrom %s\n\n",
> -name, sha1_to_hex(null_sha1));
> + if (!reference_excluded_commits) {
> + /* delete the ref */
> + printf("reset %s\nfrom %s\n\n",
> +name, sha1_to_hex(null_sha1));
> + continue;
> + }
> + /* set ref to commit using oid, not mark */
> + printf("reset %s\nfrom %s\n\n", name,
> +sha1_to_hex(commit->object.oid.hash));

Please use oid_to_hex(>object.oid) instead.

>   continue;
>   }
>  


Re: [PATCH v2 04/11] fast-export: avoid dying when filtering by paths and old tags exist

2018-11-14 Thread SZEDER Gábor
On Tue, Nov 13, 2018 at 04:25:53PM -0800, Elijah Newren wrote:
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index af724e9937..b984a44224 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -774,9 +774,12 @@ static void handle_tag(const char *name, struct tag *tag)
>   break;
>   if (!(p->object.flags & TREESAME))
>   break;
> - if (!p->parents)
> - die("can't find replacement commit for 
> tag %s",
> -  oid_to_hex(>object.oid));
> + if (!p->parents) {
> + printf("reset %s\nfrom %s\n\n",
> +name, sha1_to_hex(null_sha1));

Please use oid_to_hex(_oid) instead.

> + free(buf);
> + return;
> + }
>   p = p->parents->item;
>   }
>   tagged_mark = get_object_mark(>object);


[PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-14 Thread SZEDER Gábor
The command 'git ls-remote --sort=authordate ' segfaults when
run outside of a repository, ever since the introduction of its
'--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
2018-04-09).

While in general the 'git ls-remote' command can be run outside of a
repository just fine, its '--sort=' option with certain keys does
require access to the referenced objects.  This sorting is implemented
using the generic ref-filter sorting facility, which already handles
missing objects gracefully with the appropriate 'missing object
deadbeef for HEAD' message.  However, being generic means that it
checks replace refs while trying to retrieve an object, and while
doing so it accesses the 'git_replace_ref_base' variable, which has
not been initialized and is still a NULL pointer when outside of a
repository, thus causing the segfault.

Make ref-filter more careful upfront while parsing the format string,
and make it error out when encountering a format atom requiring object
access when we are not in a repository.  Also add a test to ensure
that 'git ls-remote --sort' fails gracefully when executed outside of
a repository.

Reported-by: H.Merijn Brand 
Signed-off-by: SZEDER Gábor 
---

On Tue, Sep 25, 2018 at 01:57:38PM -0700, Junio C Hamano wrote:
> SZEDER Gábor  writes:
> 
> > However, if we go for a more informative error message, then wouldn't
> > it be better to add this condition in populate_value() before it even
> > calls get_object()?  Then we could also add the problematic format
> > specifier to the error message (I think, but didn't actually check),
> > just in case someone specified multiple sort keys.
> 
> Even though I suspect that verify_ref_format() is the logically the
> right place to do this (after all, it is about seeing if the format
> makes sense, and a format that requires an object access used
> outside a repository should trigger an verification error), doing
> that in populate_value() probably strikes the best balance, I would
> think.

We are dealing with format specifiers used for sorting here, and those
don't go through verify_ref_format().

So how about this patch instead?

I think it will catch all cases where a user would try to use a format
specifier, for any purpose, requiring object access outside of a
repository (though I don't know whether there are any other cases
besides 'git ls-remote --sort=...'; but perhaps in the future
'ls-remote' will get a '--format' option as well), and it does so
before performing a potentially expensive query to the remote.  OTOH,
it won't change the documented "missing object" error message when run
inside a repo but the necessary object is indeed missing.


 ref-filter.c | 4 
 t/t5512-ls-remote.sh | 6 ++
 2 files changed, 10 insertions(+)

diff --git a/ref-filter.c b/ref-filter.c
index 0c45ed9d94..a1290659af 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -534,6 +534,10 @@ static int parse_ref_filter_atom(const struct ref_format 
*format,
if (ARRAY_SIZE(valid_atom) <= i)
return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
   (int)(ep-atom), atom);
+   if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
+   return strbuf_addf_ret(err, -1,
+  _("not a git repository, but the field 
'%.*s' requires access to object data"),
+  (int)(ep-atom), atom);
 
/* Add it in, including the deref prefix */
at = used_atom_cnt;
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 91ee6841c1..32e722db2e 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -302,6 +302,12 @@ test_expect_success 'ls-remote works outside repository' '
nongit git ls-remote dst.git
 '
 
+test_expect_success 'ls-remote --sort fails gracefully outside repository' '
+   # Use a sort key that requires access to the referenced objects.
+   nongit test_must_fail git ls-remote --sort=authordate 
"$TRASH_DIRECTORY" 2>err &&
+   test_i18ngrep "^fatal: not a git repository, but the field 
'\''authordate'\'' requires access to object data" err
+'
+
 test_expect_success 'ls-remote patterns work with all protocol versions' '
git for-each-ref --format="%(objectname)%(refname)" \
refs/heads/master refs/remotes/origin/master >expect &&
-- 
2.19.1.1182.gbfcc7ed3e6



Re: [PATCH v3 1/1] protocol: advertise multiple supported versions

2018-11-14 Thread SZEDER Gábor
On Wed, Nov 14, 2018 at 11:44:51AM +0900, Junio C Hamano wrote:
> SZEDER Gábor  writes:
> 
> >> +  if (tmp_allowed_versions[0] != config_version)
> >> +  for (int i = 1; i < nr_allowed_versions; i++)
> >
> > We don't do C99 yet, thus the declaration of a loop variable like this
> > is not allowed and triggers compiler errors.
> 
> I thought we did a weather-balloon to see if this bothers people who
> build on minority platforms but
> 
>   git grep 'for (int'
> 
> is coming up empty.
> 
> We have been trying designated initializers with weather-balloon
> changes (both arrays and struct fields) and I somehow thought that
> we already were trying this out, but apparently that is not the
> case.

I thought so as well, and I run the exact same 'git grep' command
before replying to Josh :)

There was a discussion about such a weather-balloon patch [1] (which
happened to use an unsigned loop variable, so our grep wouldn't have
found it anyway), but it wasn't picked up because it required new
options in CFLAGS and there was no standard way to do so [2].

[1] https://public-inbox.org/git/20170719181956.15845-1-sbel...@google.com/
[2] 
https://public-inbox.org/git/20170724170813.scceigybl5d3f...@sigill.intra.peff.net/



[PATCH 1/3] clone: use a more appropriate variable name for the default refspec

2018-11-14 Thread SZEDER Gábor
cmd_clone() declares two strbufs 'key' and 'value' on the same line,
suggesting that they are used to contruct a config variable's name and
value.  However, this is not the case: 'key' is used to construct the
names of multiple config variables, while 'value' is never used as a
value for any of those config variables, or for any other config
variable for that matter, but only to contruct the default fetch
refspec.

Let's rename 'value' to 'default_refspec' to make the intent clearer.

Signed-off-by: SZEDER Gábor 
---
 builtin/clone.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 15b142d646..ae1bf242c6 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -890,7 +890,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
const struct ref *our_head_points_at;
struct ref *mapped_refs;
const struct ref *ref;
-   struct strbuf key = STRBUF_INIT, value = STRBUF_INIT;
+   struct strbuf key = STRBUF_INIT;
+   struct strbuf default_refspec = STRBUF_INIT;
struct strbuf branch_top = STRBUF_INIT, reflog_msg = STRBUF_INIT;
struct transport *transport = NULL;
const char *src_ref_prefix = "refs/heads/";
@@ -1067,7 +1068,6 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_addf(_top, "refs/remotes/%s/", option_origin);
}
 
-   strbuf_addf(, "+%s*:%s*", src_ref_prefix, branch_top.buf);
strbuf_addf(, "remote.%s.url", option_origin);
git_config_set(key.buf, repo);
strbuf_reset();
@@ -1081,9 +1081,9 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
-   refspec_append(, value.buf);
-
-   strbuf_reset();
+   strbuf_addf(_refspec, "+%s*:%s*", src_ref_prefix,
+   branch_top.buf);
+   refspec_append(, default_refspec.buf);
 
remote = remote_get(option_origin);
transport = transport_get(remote, remote->url[0]);
@@ -1240,7 +1240,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_release(_msg);
strbuf_release(_top);
strbuf_release();
-   strbuf_release();
+   strbuf_release(_refspec);
junk_mode = JUNK_LEAVE_ALL;
 
refspec_clear();
-- 
2.19.1.1182.gbfcc7ed3e6



[PATCH 0/3] clone: respect configured fetch respecs during initial fetch

2018-11-14 Thread SZEDER Gábor


The initial fetch during a clone doesn't transfer refs matching
additional fetch refspecs given on the command line as configuration
variables, e.g. '-c remote.origin.fetch='.  This contradicts
the documentation stating that configuration variables specified via
'git clone -c = ...' "take effect immediately after the
repository is initialized, but before the remote history is fetched"
and the given example specifically mentions "adding additional fetch
refspecs to the origin remote".

The second patch in this series makes it work.  The other two are
while-at-its: the first is a little cleanup, and the last one
documents other known ignored configuration variables (but whose
functionality is available through command line options).


This patch series should have been marked as v6, but I chose to reset
the counter, because:

  - v5 was sent out way over a year ago [1], and surely everybody has
forgotten about it since then anyway.  But more importantly:

  - A lot has happened since then, most notably we now have a refspec
API, which makes this patch series much simpler (now it only
touches 'builtin/clone.c', the previous version had to add stuff
to 'remote.{c,h}' as well).


[1] For reference, though I actually doubt it's worth looking up:
https://public-inbox.org/git/20170616173849.8071-1-szeder....@gmail.com/T/#u


SZEDER Gábor (3):
  clone: use a more appropriate variable name for the default refspec
  clone: respect additional configured fetch refspecs during initial
fetch
  Documentation/clone: document ignored configuration variables

 Documentation/git-clone.txt |  6 +
 builtin/clone.c | 33 +++---
 t/t5611-clone-config.sh | 47 +
 3 files changed, 72 insertions(+), 14 deletions(-)

-- 
2.19.1.1182.gbfcc7ed3e6



[PATCH 2/3] clone: respect additional configured fetch refspecs during initial fetch

2018-11-14 Thread SZEDER Gábor
The initial fetch during a clone doesn't transfer refs matching
additional fetch refspecs given on the command line as configuration
variables, e.g. '-c remote.origin.fetch='.  This contradicts
the documentation stating that configuration variables specified via
'git clone -c = ...' "take effect immediately after the
repository is initialized, but before the remote history is fetched"
and the given example specifically mentions "adding additional fetch
refspecs to the origin remote".  Furthermore, one-shot configuration
variables specified via 'git -c = clone ...', though not
written to the newly created repository's config file, live during the
lifetime of the 'clone' command, including the initial fetch.  All
this implies that any fetch refspecs specified this way should already
be taken into account during the initial fetch.

The reason for this is that the initial fetch is not a fully fledged
'git fetch' but a bunch of direct calls into the fetch/transport
machinery with clone's own refs-to-refspec matching logic, which
bypasses parts of 'git fetch' processing configured fetch refspecs.
This logic only considers a single default refspec, potentially
influenced by options like '--single-branch' and '--mirror'.  The
configured refspecs are, however, already read and parsed properly
when clone calls remote.c:remote_get(), but it never looks at the
parsed refspecs in the resulting 'struct remote'.

Modify clone to take the remote's configured fetch refspecs into
account to retrieve all matching refs during the initial fetch.  Note
that we have to explicitly add the default fetch refspec to the
remote's refspecs, because at that point the remote only includes the
fetch refspecs specified on the command line.

Add tests to check that refspecs given both via 'git clone -c ...' and
'git -c ... clone' retrieve all refs matching either the default or
the additional refspecs, and that it works even when the user
specifies an alternative remote name via '--origin='.

Signed-off-by: SZEDER Gábor 
---
 builtin/clone.c | 25 +-
 t/t5611-clone-config.sh | 47 +
 2 files changed, 62 insertions(+), 10 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index ae1bf242c6..7c7f98c72c 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -548,7 +548,7 @@ static struct ref *find_remote_branch(const struct ref 
*refs, const char *branch
 }
 
 static struct ref *wanted_peer_refs(const struct ref *refs,
-   struct refspec_item *refspec)
+   struct refspec *refspec)
 {
struct ref *head = copy_ref(find_ref_by_name(refs, "HEAD"));
struct ref *local_refs = head;
@@ -569,13 +569,19 @@ static struct ref *wanted_peer_refs(const struct ref 
*refs,
warning(_("Could not find remote branch %s to clone."),
option_branch);
else {
-   get_fetch_map(remote_head, refspec, , 0);
+   int i;
+   for (i = 0; i < refspec->nr; i++)
+   get_fetch_map(remote_head, >items[i],
+ , 0);
 
/* if --branch=tag, pull the requested tag explicitly */
get_fetch_map(remote_head, tag_refspec, , 0);
}
-   } else
-   get_fetch_map(refs, refspec, , 0);
+   } else {
+   int i;
+   for (i = 0; i < refspec->nr; i++)
+   get_fetch_map(refs, >items[i], , 0);
+   }
 
if (!option_mirror && !option_single_branch && !option_no_tags)
get_fetch_map(refs, tag_refspec, , 0);
@@ -899,7 +905,6 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
int err = 0, complete_refs_before_fetch = 1;
int submodule_progress;
 
-   struct refspec rs = REFSPEC_INIT_FETCH;
struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
fetch_if_missing = 0;
@@ -1081,11 +1086,12 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
+   remote = remote_get(option_origin);
+
strbuf_addf(_refspec, "+%s*:%s*", src_ref_prefix,
branch_top.buf);
-   refspec_append(, default_refspec.buf);
+   refspec_append(>fetch, default_refspec.buf);
 
-   remote = remote_get(option_origin);
transport = transport_get(remote, remote->url[0]);
transport_set_verbosity(transport, option_verbosity, option_progress);
transport->family = family;
@@ -1140,7 +1146,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
 
 
argv_array_push(_prefixes, "HEAD");
-   refspec_ref_prefixes(, _prefixes);
+ 

[PATCH 3/3] Documentation/clone: document ignored configuration variables

2018-11-14 Thread SZEDER Gábor
Due to limitations in the current implementation, some configuration
variables specified via 'git clone -c var=val' (or 'git -c var=val
clone') are ignored during the initial fetch and checkout.

Let the users know which configuration variables are known to be
ignored ('remote.origin.mirror' and 'remote.origin.tagOpt') under the
documentation of 'git clone -c', along with hints to use the options
'--mirror' and '--no-tags' instead.

Signed-off-by: SZEDER Gábor 
---
 Documentation/git-clone.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index a55536f0bf..2fd12524f9 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -189,6 +189,12 @@ objects from the source repository into a pack in the 
cloned repository.
values are given for the same key, each value will be written to
the config file. This makes it safe, for example, to add
additional fetch refspecs to the origin remote.
++
+Due to limitations of the current implementation, some configuration
+variables do not take effect until after the initial fetch and checkout.
+Configuration variables known to not take effect are:
+`remote..mirror` and `remote..tagOpt`.  Use the
+corresponding `--mirror` and `--no-tags` options instead.
 
 --depth ::
Create a 'shallow' clone with a history truncated to the
-- 
2.19.1.1182.gbfcc7ed3e6



Re: [PATCH v3 1/1] protocol: advertise multiple supported versions

2018-11-13 Thread SZEDER Gábor
On Tue, Nov 13, 2018 at 03:03:47PM -0800, Josh Steadmon wrote:
> > > + for (int i = 1; i < nr_allowed_versions; i++)
> > 
> > We don't do C99 yet, thus the declaration of a loop variable like this
> > is not allowed and triggers compiler errors.

> Sorry about that. Will fix in v4. Out of curiousity, do you have a
> config.mak snippet that will make these into errors? I played around
> with adding combinations of -ansi, -std=c89, and -pedantic to CFLAGS,
> but I couldn't get anything that detect the problem without also
> breaking on other parts of the build.

Unfortunately, I don't have such an universal CFLAGS.

With gcc-4.8 the default CFLAGS is sufficient:

  $ make V=1 CC=gcc-4.8 protocol.o
  gcc-4.8 -o protocol.o -c -MF ./.depend/protocol.o.d -MQ protocol.o -MMD -MP  
-g -O2 -Wall -I. -DHAVE_SYSINFO -DGIT_HOST_CPU="\"x86_64\"" -DHAVE_ALLOCA_H 
-DUSE_CURL_FOR_IMAP_SEND -DSHA1_DC -DSHA1DC_NO_STANDARD_INCLUDES 
-DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 -DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" 
-DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\""  -DHAVE_PATHS_H 
-DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME -DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM 
'-DPROCFS_EXECUTABLE_PATH="/proc/self/exe"'  -DFREAD_READS_DIRECTORIES 
-DNO_STRLCPY -DSHELL_PATH='"/bin/sh"' -DPAGER_ENV='"LESS=FRX LV=-c"'  protocol.c
  protocol.c: In function ‘get_client_protocol_version_advertisement’:
  protocol.c:112:3: error: ‘for’ loop initial declarations are only allowed in 
C99 mode
 for (int i = 1; i < nr_allowed_versions; i++)
 ^
  < ... snip ... >

I couldn't get this error with any newer GCC or Clang, and using
options like -std=c89 trigger many other errors as well, just like you
mentioned.




Re: [PATCH v3 1/1] protocol: advertise multiple supported versions

2018-11-13 Thread SZEDER Gábor
On Mon, Nov 12, 2018 at 01:49:05PM -0800, stead...@google.com wrote:

> diff --git a/protocol.c b/protocol.c
> index 5e636785d1..54d2ab991b 100644
> --- a/protocol.c
> +++ b/protocol.c

> +void get_client_protocol_version_advertisement(struct strbuf *advert)
> +{
> + int tmp_nr = nr_allowed_versions;
> + enum protocol_version *tmp_allowed_versions, config_version;
> + strbuf_reset(advert);
> +
> + have_advertised_versions_already = 1;
> +
> + config_version = get_protocol_version_config();
> + if (config_version == protocol_v0) {
> + strbuf_addstr(advert, "version=0");
> + return;
> + }
> +
> + if (tmp_nr > 0) {
> + ALLOC_ARRAY(tmp_allowed_versions, tmp_nr);
> + copy_array(tmp_allowed_versions, allowed_versions, tmp_nr,
> +sizeof(enum protocol_version));
> + } else {
> + ALLOC_ARRAY(tmp_allowed_versions, 1);
> + tmp_nr = 1;
> + tmp_allowed_versions[0] = config_version;
> + }
> +
> + if (tmp_allowed_versions[0] != config_version)
> + for (int i = 1; i < nr_allowed_versions; i++)

We don't do C99 yet, thus the declaration of a loop variable like this
is not allowed and triggers compiler errors.

> + if (tmp_allowed_versions[i] == config_version) {
> + enum protocol_version swap =
> + tmp_allowed_versions[0];
> + tmp_allowed_versions[0] =
> + tmp_allowed_versions[i];
> + tmp_allowed_versions[i] = swap;
> + }
> +
> + strbuf_addf(advert, "version=%s",
> + format_protocol_version(tmp_allowed_versions[0]));
> + for (int i = 1; i < tmp_nr; i++)

Likewise.

> + strbuf_addf(advert, ":version=%s",
> + format_protocol_version(tmp_allowed_versions[i]));
> +}


Re: [PATCH] coccicheck: introduce 'pending' semantic patches

2018-11-13 Thread SZEDER Gábor
On Fri, Nov 09, 2018 at 04:10:52PM -0800, Stefan Beller wrote:
> From: SZEDER Gábor 
> 
> Teach `make coccicheck` to avoid patches named "*.pending.cocci" and
> handle them separately in a new `make coccicheck-pending` instead.
> This means that we can separate "critical" patches from "FYI" patches.
> The former target can continue causing Travis to fail its static
> analysis job, while the latter can let us keep an eye on ongoing
> (pending) transitions without them causing too much fallout.
> 
> Document the intended use-cases around these two targets.
> As the process around the pending patches is not yet fully explored,
> leave that out.
> 
> Based-on-work-by: SZEDER Gábor 

Hm, do I need to sign off?
Well, I sign off now anyway, and then Junio can take it if necessary
or just ignore it if not.

Signed-off-by: SZEDER Gábor 

> Signed-off-by: Stefan Beller 
> Signed-off-by: Junio C Hamano 
> ---
> 
> I dialed back on the workflow, as we may want to explore it first
> before writing it down.

Yeah, I too think that's better to wait with the workflow details
until we gather some experience about how it works out in practice.

Thanks for following it through.



Re: [PATCH] Makefile: add pending semantic patches

2018-11-13 Thread SZEDER Gábor
On Fri, Nov 09, 2018 at 01:58:01PM -0800, Stefan Beller wrote:
> On Thu, Nov 8, 2018 at 9:18 PM Junio C Hamano  wrote:
> > Are they wrong changes (e.g. a carelessly written read_cache() to
> > read_index(_index) conversion may munge the implementation of
> > read_cache(...) { return read_index(_index, ...); } and make
> > inifinite recursion)?  Or are they "correct but not immediately
> > necessary" (e.g. because calling read_cache() does not hurt until
> > that function gets removed, so rewriting the callers to call
> > read_index() with _index may be correct but not immediately
> > necessary)?
> 
> the latter. I assume correctness (of the semantic patch) to be a given,

I'm afraid we can't assume that.  As far as correctness is concerned,
I think semantic patches are not different from any other code we
write, i.e. they are potentially buggy.  Perhaps even more so than the
"usual" Git code, because we have long experience writing C and shell
code (with all their quirks and error-proneness), but none of us is
really an expert in writing semantic patches.

Cases in point:

  - 6afedba8c9 (object_id.cocci: match only expressions of type
'struct object_id', 2018-10-15)
  - 279ffad17d (coccinelle: avoid wrong transformation suggestions
from commit.cocci, 2018-04-30)
  - cd9a4b6d93 (cocci: use format keyword instead of a literal string,
2018-01-19); though this one is probably a bug in Coccinelle
itself
  - c2bb0c1d1e (cocci: avoid self-references in object_id
transformations, 2016-11-01)



Re: [PATCH] travis-ci: install packages in 'ci/install-dependencies.sh'

2018-11-08 Thread SZEDER Gábor
On Fri, Nov 02, 2018 at 11:25:17AM +0900, Junio C Hamano wrote:
> SZEDER Gábor  writes:
> 
> > Ever since we started using Travis CI, we specified the list of
> > packages to install in '.travis.yml' via the APT addon.  While running
> > our builds on Travis CI's container-based infrastructure we didn't
> > have another choice, because that environment didn't support 'sudo',
> > and thus we didn't have permission to install packages ourselves.  With
> > the switch to the VM-based infrastructure in the previous patch we do
> > get a working 'sudo', so we can install packages by running 'sudo
> > apt-get -y install ...' as well.
> 
> OK, so far we learned that this is now _doable_; but not enough to
> decide if this is a good thing to do or not.  Let's read on to find
> out.

Yeah, this paragraph is just a bit of background about how the current
situation came to be and what recent change made the switch possible.

> > Let's make use of this and install necessary packages in
> > 'ci/install-dependencies.sh', so all the dependencies (i.e. both
> > packages and "non-packages" (P4 and Git-LFS)) are handled in the same
> > file.  
> 
> So we used to have two waysto prepare the test environment; non
> packaged software were done via install-dependencies.sh, but
> packaged ones weren't.  Unifying them so that the script installs
> both would be a good change to simplify the procedure.  
> 
> Is that how this sentence argues for this change?

Yes.

> > Install gcc-8 only in the 'linux-gcc' build job; so far it has
> > been unnecessarily installed in the 'linux-clang' build job as well.
> 
> Is this "unneeded gcc-8 was installed" something we can fix only
> because we now stopped doing the installation via apt addon?

Now that you mention it: no.  It would have been possible to install
gcc-8 only in the 'linux-gcc' build job even via the apt addon, namely
by removing the two Linux build jobs from the implicit build matrix
and adding them as two independent build jobs in the 'matrix.include'
section of '.travis.yml'.  The drawback is that all the extra packages
used in both build jobs would have to be duplicated.

> Or we
> could have fixed it while we were on apt addon but we didn't bother,
> and this patch fixes it "while at it"---primarily because the shell
> script is far more flexible to work with than travis.yml matrix and
> this kind of customization is far easier to do?

Basically yes (though I think it's not about not bothering; I don't
know about others, but it just occured to me that it would have been
doable, however, even if it occured to me earlier, because of the
duplicated list of common packages I wouldn't have done it).

Doing it in good old shell is indeed easier and the common packages
are then only listed once.

> > Print the versions of P4 and Git-LFS conditionally, i.e. only when
> > they have been installed; with this change even the static analysis
> > and documentation build jobs start using 'ci/install-dependencies.sh'
> > to install packages, and neither of these two build jobs depend on and
> > thus install those.
> >
> > This change will presumably be beneficial for the upcoming Azure
> > Pipelines integration [1]: preliminary versions of that patch series
> > run a couple of 'apt-get' commands to install the necessary packages
> > before running 'ci/install-dependencies.sh', but with this patch it
> > will be sufficient to run only 'ci/install-dependencies.sh'.
> 
> So the main point of this change is to have less knowledge to
> prepare the target configuration in the .travis.yml file and keep
> them all in ci/install-dependencies.sh, which hopefully is more
> reusable than .travis.yml in a non Travis environment?

Oh, "more reusable" indeed, that's a more eloquent way to put it.

> If that is the case, it makes sense to me.
> 
> > This patch should go on top of 'ss/travis-ci-force-vm-mode'.
> >
> > I'm not sure about the last paragraph, because:
> >
> >   - It talks about presumed benefits for a currently still
> > work-in-progress patch series of an other contributor, and I'm not
> > really sure that that's a good thing.  Perhaps I should have
> > rather put it below the '---'.
> >
> >   - I'm confused about the name of this Azure thing.  The cover letter
> > mentions "Azure Pipelines", the file is called
> > 'azure-pipelines.yml', but the relevant patch I link to talks
> > about "Azure DevOps" in the commit message.
> >
> > Anyway, keep that last paragraph or drop it as you see fit.
> 
> I hope we'll hear from Dscho in one or two revolutions of the Earth
> ;-)

... revolutions around what? :)



Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option

2018-11-08 Thread SZEDER Gábor
On Thu, Nov 08, 2018 at 09:26:19PM +0100, Ævar Arnfjörð Bjarmason wrote:
> On Fri, Nov 02 2018, SZEDER Gábor wrote:

> >>  * We error out in the Makefile if you're still saying
> >>GETTEXT_POISON=YesPlease.
> >>
> >>This makes more sense than just making it a synonym since now this
> >>also needs to be defined at runtime.
> >
> > OK, I think erroring out is better than silently ignoring
> > GETTEXT_POISON=YesPlease.  However, the commit message only mentions
> > that GETTEXT_POISON is gone, but perhaps this should be mentioned
> > there as well.
> 
> Will mention.

With the benefit of hindsight gained from looking into a failing
GETTEXT_POISON build [1], I have to agree with Junio that flat-out
erroring out when GETTEXT_POISON=YesPlease is set is really a hassle
[2].

[1] https://public-inbox.org/git/20181107130950.ga30...@szeder.dev/
(the failure is not related to this patch)
[2] https://public-inbox.org/git/xmqqpnvg8d5z@gitster-ct.c.googlers.com/

> >> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> >> index 78d8c3783b..2f42b3653c 100644
> >> --- a/t/test-lib-functions.sh
> >> +++ b/t/test-lib-functions.sh
> >> @@ -755,16 +755,16 @@ test_cmp_bin() {
> >>
> >>  # Use this instead of test_cmp to compare files that contain expected and
> >>  # actual output from git commands that can be translated.  When running
> >> -# under GETTEXT_POISON this pretends that the command produced expected
> >> +# under GIT_TEST_GETTEXT_POISON this pretends that the command produced 
> >> expected
> >>  # results.
> >>  test_i18ncmp () {
> >> -  test -n "$GETTEXT_POISON" || test_cmp "$@"
> >> +  ! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
> >>  }
> >
> >>  test_i18ngrep () {
> >>eval "last_arg=\${$#}"
> >> @@ -779,7 +779,7 @@ test_i18ngrep () {
> >>error "bug in the test script: too few parameters to 
> >> test_i18ngrep"
> >>fi
> >>
> >> -  if test -n "$GETTEXT_POISON"
> >> +  if test_have_prereq !C_LOCALE_OUTPUT
> >
> > Why do these two helper functions call test_have_prereq (a function
> > that doesn't even fit on my screen) instead of a simple
> >
> >   test -n "$GIT_TEST_GETTEXT_POISON"
> 
> I'm going to keep the call to test_have_prereq, it's consistent with
> other use of the helper in the test lib, and doesn't confuse the reader
> by giving the impression that we're in some really early setup where we
> haven't set the prereq at this point (we have).

Using the prereq in the first place is what confused (and is still
confusing...) the reader.



Re: What's cooking in git.git (Nov 2018, #03; Wed, 7)

2018-11-07 Thread SZEDER Gábor
On Wed, Nov 07, 2018 at 06:41:45PM +0900, Junio C Hamano wrote:
> * nd/i18n (2018-11-06) 16 commits
>  - fsck: mark strings for translation
>  - fsck: reduce word legos to help i18n
>  - parse-options.c: mark more strings for translation
>  - parse-options.c: turn some die() to BUG()
>  - parse-options: replace opterror() with optname()
>  - repack: mark more strings for translation
>  - remote.c: mark messages for translation
>  - remote.c: turn some error() or die() to BUG()
>  - reflog: mark strings for translation
>  - read-cache.c: add missing colon separators
>  - read-cache.c: mark more strings for translation
>  - read-cache.c: turn die("internal error") to BUG()
>  - attr.c: mark more string for translation
>  - archive.c: mark more strings for translation
>  - alias.c: mark split_cmdline_strerror() strings for translation
>  - git.c: mark more strings for translation
> 
>  More _("i18n") markings.

When this patch is merged into 'pu' all four tests added to
't1450-fsck.sh' in b29759d89a (fsck: check HEAD and reflog from other
worktrees, 2018-10-21) as part of 'nd/per-worktree-ref-iteration'
below fail when run with GETTEXT_POISON=y.  The test suite passes in
both of these topics on their own, even with GETTEXT_POISON, it's
their merge that is somehow problematic.

> * nd/per-worktree-ref-iteration (2018-11-05) 9 commits
>   (merged to 'next' on 2018-11-06 at 53803cedf3)
>  + git-worktree.txt: correct linkgit command name
>   (merged to 'next' on 2018-11-03 at 4cbe49a704)
>  + reflog expire: cover reflog from all worktrees
>  + fsck: check HEAD and reflog from other worktrees
>  + fsck: move fsck_head_link() to get_default_heads() to avoid some globals
>  + revision.c: better error reporting on ref from different worktrees
>  + revision.c: correct a parameter name
>  + refs: new ref types to make per-worktree refs visible to all worktrees
>  + Add a place for (not) sharing stuff between worktrees
>  + refs.c: indent with tabs, not spaces
> 
>  The code to traverse objects for reachability, used to decide what
>  objects are unreferenced and expendable, have been taught to also
>  consider per-worktree refs of other worktrees as starting points to
>  prevent data loss.
> 
>  Will merge to 'master'.


Re: Git Test Coverage Report (Friday, Nov 2)

2018-11-03 Thread SZEDER Gábor
On Fri, Nov 02, 2018 at 10:16:48PM -0400, Derrick Stolee wrote:
> Here is the coverage report for today. Some builds were timing out, so I
> removed the tests with number 9000 or more from the build [1]. Hopefully
> this is a temporary measure.

I think it's the Azure CI patch series, see:

https://public-inbox.org/git/20181017232952.gt19...@szeder.dev/
https://public-inbox.org/git/20181021112053.gc30...@szeder.dev/



Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option

2018-11-02 Thread SZEDER Gábor
On Thu, Nov 01, 2018 at 07:31:15PM +, Ævar Arnfjörð Bjarmason wrote:
> Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
> test parameter to only be a GIT_TEST_GETTEXT_POISON=
> runtime parameter, to be consistent with other parameters documented
> in "Running tests with special setups" in t/README.
> 
> When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON
> to simulate unfriendly translator", 2011-02-22) I was concerned with
> ensuring that the _() function would get constant folded if NO_GETTEXT
> was defined, and likewise that GETTEXT_POISON would be compiled out
> unless it was defined.
> 
> But as the benchmark in my [1] shows doing a one-off runtime
> getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was
> originally added the GIT_TEST_* env variables have become the common
> idiom for turning on special test setups.
> 
> So change GETTEXT_POISON to work the same way. Now the
> GETTEXT_POISON=YesPlease compile-time option is gone, and running the
> tests with GIT_TEST_GETTEXT_POISON=[YesPlease|] can be toggled on/off
> without recompiling.
> 
> This allows for conditionally amending tests to test with/without
> poison, similar to what 859fdc0c3c ("commit-graph: define
> GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do
> some of that, now we e.g. always run the t0205-gettext-poison.sh test.
> 
> I did enough there to remove the GETTEXT_POISON prerequisite, but its
> inverse C_LOCALE_OUTPUT is still around, and surely some tests using
> it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=.
> 
> Notes on the implementation:
> 
>  * We still compile a dedicated GETTEXT_POISON build in Travis CI.
>This is probably the wrong thing to do and should be followed-up
>with something similar to ae59a4e44f ("travis: run tests with
>GIT_TEST_SPLIT_INDEX", 2018-01-07) to re-use an existing test setup
>for running in the GIT_TEST_GETTEXT_POISON mode.

I'm of two minds about this.  Sure, now it's not a compile time
option, so we can spare a dedicated compilation, and sparing resources
is good, of course.

However, checking test failures in the 'linux-gcc' build job is always
a bit of a hassle, because it's not enough to look at the output of
the failed test, but I also have to check which one of the two test
runs failed (i.e. the "regular" or one with all the GIT_TEST_* knobs
turned on).  Adding a second test run with GIT_TEST_GETTEXT_POISON
enabled to an other build job (e.g. 'linux-clang') would then bring
this hassle to that build job as well.

Furthermore, occasionally a build job is slower than usual (whatever
the reason might be), which can be an issue when running the test
suite twice, as it can exceed the time limit.

Anyway, we can think about this later.

In any case, GIT_TEST_GETTEXT_POISON definitely should not be enabled
in the same "catch-all" test run with all other GIT_TEST_* variables,
because it would mess up any translated error messages that might pop
up in a test failure, and then we wouldn't have any idea about what
went wrong.

>  * We now skip a test in t-basic.sh under
>GIT_TEST_GETTEXT_POISON=YesPlease that wasn't skipped before. This
>test relies on C locale output, but due to an edge case in how the
>previous implementation of GETTEXT_POISON worked (reading it from
>GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does,
>and needs to be skipped.
> 
>  * The getenv() function is not reentrant, so out of paranoia about
>code of the form:
> 
>printf(_("%s"), getenv("some-env"));
> 
>call use_gettext_poison() in our early setup in git_setup_gettext()
>so we populate the "poison_requested" variable in a codepath that's
>won't suffer from that race condition.
> 
> See also [3] for more on the motivation behind this patch, and the
> history of the GETTEXT_POISON facility.
> 
> 1. https://public-inbox.org/git/871s8gd32p@evledraar.gmail.com/
> 2. https://public-inbox.org/git/87woq7b58k@evledraar.gmail.com/
> 3. https://public-inbox.org/git/878t2pd6yu@evledraar.gmail.com/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
> 
> Now:
> 
>  * The new i18n helper is gone. We just use "test -n" semantics for
>$GIT_TEST_GETTEXT_POISON
> 
>  * We error out in the Makefile if you're still saying
>GETTEXT_POISON=YesPlease.
> 
>This makes more sense than just making it a synonym since now this
>also needs to be defined at runtime.

OK, I think erroring out is better than silently ignoring
GETTEXT_POISON=YesPlease.  However, the commit message only mentions
that GETTEXT_POISON is gone, but perhaps this should be mentioned
there as well.

>  * The caveat with avoiding test_lazy_prereq is gone (although there's
>still some unrelated bug there worth looking into).
> 
>  * We call use_gettext_poison() really early to avoid any reentrancy
>issue with getenv().

> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> 

Re: [PATCH v5 6/7] revision.c: generation-based topo-order algorithm

2018-11-01 Thread SZEDER Gábor
On Thu, Nov 01, 2018 at 01:46:22PM +, Derrick Stolee wrote:
> 1. EXPLORE: using the explore_queue priority queue (ordered by
>maximizing the generation number)

> 2. INDEGREE: using the indegree_queue priority queue (ordered
>by maximizing the generation number)

Nit: I've been pondering for a while what exactly does "order by
maximizing ..." mean.  Highest to lowest or lowest to highest?  If I
understand the rest of the descriptions (that I snipped) correctly,
then it's the former, but I find that phrase in itself too ambiguous.



Re: ab/* topics (was: Re: What's cooking in git.git (Nov 2018, #01; Thu, 1))

2018-11-01 Thread SZEDER Gábor
On Thu, Nov 01, 2018 at 02:46:41PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > However, if you push that patch with 'sh-i18n--helper' as-is, then I
> > do object now: parsing a boolean in shell is not at all that difficult
> > to justify this new command.
> 
> So instead of calling a helper (which is undocumented, and only used by
> git itself internally) you'd instead like to have some shellscript
> thingy like:
> 
> if test $value = 'true' -o $value = '1' []
> then
> exit 0
> elif test $value = 'false' -o $value = '0' [...]

Yes, but more like:

  case "$GIT_TEST_GETTEXT_POISON" in
  yes|true|on|1)
GIT_INTERNAL_GETTEXT_SH_SCHEME=poison
;;
  esac

There is no need to check for 'false', 0, etc.

Yes, I know that this is not as fully-fledged as git_env_bool(), e.g.
it won't recognize 'TrUe' and '42' as true, but since this is "merely"
a testing aid, I think it's sufficient.

> Sure, if that's the consensus I can change that, but it seems like the
> opposite of the direction we're moving with the general *.sh -> *.c
> migration. I.e. implementing helpers whenever possible instead of adding
> new shellscript-only logic.

I doubt that we really want to implement helpers "whenever possible",
i.e. even instead of such a rather trivial piece of shell code.

In the discusson of v1 of that patch there were suggestions on how to
turn an environment variable into a boolean in a more proper way, e.g.
by extending 'git var', but I agree with Peff that "we should do
nothing for now and see how often this comes up" [1].  In the meantime
this builtin seems to be the worse direction of all.

[1] https://public-inbox.org/git/20181025212358.ga23...@sigill.intra.peff.net/



Re: ab/* topics (was: Re: What's cooking in git.git (Nov 2018, #01; Thu, 1))

2018-11-01 Thread SZEDER Gábor
On Thu, Nov 01, 2018 at 12:02:01PM +0100, Ævar Arnfjörð Bjarmason wrote:
> Could you please pick up
> https://public-inbox.org/git/20181024114725.3927-1-ava...@gmail.com/ ?
> It seems to have fallen between the cracks and addressed the feedback on
> v1, and looks good to me (and nobody's objected so far...).

I didn't object, because in

  https://public-inbox.org/git/87muqzllh0@evledraar.gmail.com/

you asked for "a more general review than just the problem of how
we turn an env variable into a boolean".

However, if you push that patch with 'sh-i18n--helper' as-is, then I
do object now: parsing a boolean in shell is not at all that difficult
to justify this new command.



[PATCH] travis-ci: install packages in 'ci/install-dependencies.sh'

2018-11-01 Thread SZEDER Gábor
Ever since we started using Travis CI, we specified the list of
packages to install in '.travis.yml' via the APT addon.  While running
our builds on Travis CI's container-based infrastructure we didn't
have another choice, because that environment didn't support 'sudo',
and thus we didn't have permission to install packages ourselves.  With
the switch to the VM-based infrastructure in the previous patch we do
get a working 'sudo', so we can install packages by running 'sudo
apt-get -y install ...' as well.

Let's make use of this and install necessary packages in
'ci/install-dependencies.sh', so all the dependencies (i.e. both
packages and "non-packages" (P4 and Git-LFS)) are handled in the same
file.  Install gcc-8 only in the 'linux-gcc' build job; so far it has
been unnecessarily installed in the 'linux-clang' build job as well.
Print the versions of P4 and Git-LFS conditionally, i.e. only when
they have been installed; with this change even the static analysis
and documentation build jobs start using 'ci/install-dependencies.sh'
to install packages, and neither of these two build jobs depend on and
thus install those.

This change will presumably be beneficial for the upcoming Azure
Pipelines integration [1]: preliminary versions of that patch series
run a couple of 'apt-get' commands to install the necessary packages
before running 'ci/install-dependencies.sh', but with this patch it
will be sufficient to run only 'ci/install-dependencies.sh'.

[1] 
https://public-inbox.org/git/1a22efe849d6da79f2c639c62a1483361a130238.1539598316.git.gitgitgad...@gmail.com/

Signed-off-by: SZEDER Gábor 
---

This patch should go on top of 'ss/travis-ci-force-vm-mode'.

I'm not sure about the last paragraph, because:

  - It talks about presumed benefits for a currently still
work-in-progress patch series of an other contributor, and I'm not
really sure that that's a good thing.  Perhaps I should have
rather put it below the '---'.

  - I'm confused about the name of this Azure thing.  The cover letter
mentions "Azure Pipelines", the file is called
'azure-pipelines.yml', but the relevant patch I link to talks
about "Azure DevOps" in the commit message.

Anyway, keep that last paragraph or drop it as you see fit.


 .travis.yml| 21 -
 ci/install-dependencies.sh | 35 +--
 2 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 8d2499739e..a5a82d6832 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -12,16 +12,6 @@ compiler:
   - clang
   - gcc
 
-addons:
-  apt:
-sources:
-- ubuntu-toolchain-r-test
-packages:
-- language-pack-is
-- git-svn
-- apache2
-- gcc-8
-
 matrix:
   include:
 - env: jobname=GETTEXT_POISON
@@ -50,22 +40,11 @@ matrix:
 - env: jobname=StaticAnalysis
   os: linux
   compiler:
-  addons:
-apt:
-  packages:
-  - coccinelle
-  before_install:
   script: ci/run-static-analysis.sh
   after_failure:
 - env: jobname=Documentation
   os: linux
   compiler:
-  addons:
-apt:
-  packages:
-  - asciidoc
-  - xmlto
-  before_install:
   script: ci/test-documentation.sh
   after_failure:
 
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 75a9fd2475..06c3546e1e 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -10,6 +10,15 @@ 
LFSWHENCE=https://github.com/github/git-lfs/releases/download/v$LINUX_GIT_LFS_VE
 
 case "$jobname" in
 linux-clang|linux-gcc)
+   sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
+   sudo apt-get -q update
+   sudo apt-get -q -y install language-pack-is git-svn apache2
+   case "$jobname" in
+   linux-gcc)
+   sudo apt-get -q -y install gcc-8
+   ;;
+   esac
+
mkdir --parents "$P4_PATH"
pushd "$P4_PATH"
wget --quiet "$P4WHENCE/bin.linux26x86_64/p4d"
@@ -32,11 +41,25 @@ osx-clang|osx-gcc)
brew link --force gettext
brew install caskroom/cask/perforce
;;
+StaticAnalysis)
+   sudo apt-get -q update
+   sudo apt-get -q -y install coccinelle
+   ;;
+Documentation)
+   sudo apt-get -q update
+   sudo apt-get -q -y install asciidoc xmlto
+   ;;
 esac
 
-echo "$(tput setaf 6)Perforce Server Version$(tput sgr0)"
-p4d -V | grep Rev.
-echo "$(tput setaf 6)Perforce Client Version$(tput sgr0)"
-p4 -V | grep Rev.
-echo "$(tput setaf 6)Git-LFS Version$(tput sgr0)"
-git-lfs version
+if type p4d >/dev/null && type p4 >/dev/null
+then
+   echo "$(tput setaf 6)Perforce Server Version$(tput sgr0)"
+   p4d -V | grep Rev.
+   echo "$(tput setaf 6)Perforce Client Version$(tput sgr0)"
+   p4

Re: [PATCH] test-lib: introduce the '-V' short option for '--verbose-log'

2018-10-29 Thread SZEDER Gábor
On Mon, Oct 29, 2018 at 10:21:08AM -0400, Jeff King wrote:
> On Mon, Oct 29, 2018 at 01:13:59PM +0100, SZEDER Gábor wrote:
> 
> > '--verbose-log' is one of the most useful and thus most frequently
> > used test options, but due to its length it's a pain to type on the
> > command line.
> > 
> > Let's introduce the corresponding short option '-V' to save some
> > keystrokes.
> 
> Interesting. I'm not opposed to something like this, but I added
> "--verbose-log" specifically for scripted cases, like running an
> unattended "prove" that needs to preserve stdout. When running
> individual tests, I'd just use "-v" itself, and possibly redirect the
> output.
> 
> For my curiosity, can you describe your use case a bit more?

Even when I run individual test scripts by hand, I prefer to have a
file catching all output of the test, because I don't like it when the
test output floods my terminal (especially with '-x'), and because the
file is searchable but the terminal isn't.  And that's exactly what
'--verbose-log' does.

Redirecting the '-v' output (i.e. stdout) alone is insufficient,
because any error messages within the tests and the '-x' trace go to
stderr, so they still end up on the terminal.  Therefore I would have
to remember to redirect stderr every time as well.

I find it's much easier to just always use '--verbose-log'... except
for the length of the option, that is, hence this patch.




[PATCH] test-lib: introduce the '-V' short option for '--verbose-log'

2018-10-29 Thread SZEDER Gábor
'--verbose-log' is one of the most useful and thus most frequently
used test options, but due to its length it's a pain to type on the
command line.

Let's introduce the corresponding short option '-V' to save some
keystrokes.

Signed-off-by: SZEDER Gábor 
---

Or it could be '-L', to emphasize the "log" part of the long option, I
don't really care.

 t/README  | 1 +
 t/test-lib.sh | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/README b/t/README
index 8847489640..2e9bef2852 100644
--- a/t/README
+++ b/t/README
@@ -154,6 +154,7 @@ appropriately before running "make".
As the names depend on the tests' file names, it is safe to
run the tests with this option in parallel.
 
+-V::
 --verbose-log::
Write verbose output to the same logfile as `--tee`, but do
_not_ write it to stdout. Unlike `--tee --verbose`, this option
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 897e6fcc94..47a99aa0ed 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -67,7 +67,7 @@ case "$GIT_TEST_TEE_STARTED, $* " in
 done,*)
# do not redirect again
;;
-*' --tee '*|*' --va'*|*' --verbose-log '*)
+*' --tee '*|*' --va'*|*' -V '*|*' --verbose-log '*)
mkdir -p "$TEST_OUTPUT_DIRECTORY/test-results"
BASE="$TEST_OUTPUT_DIRECTORY/test-results/$(basename "$0" .sh)"
 
@@ -316,7 +316,7 @@ do
echo >&2 "warning: ignoring -x; '$0' is untraceable 
without BASH_XTRACEFD"
fi
shift ;;
-   --verbose-log)
+   -V|--verbose-log)
verbose_log=t
shift ;;
*)
-- 
2.19.1.723.g6817e59acc



Re: 'ds/test-multi-pack-index' vs. 'ab/commit-graph-progress'

2018-10-29 Thread SZEDER Gábor
On Thu, Oct 25, 2018 at 11:11:55PM +0200, Ævar Arnfjörð Bjarmason wrote:
> > Anyway, this test seems to be too fragile, because that
> >
> >   test_line_count = 1 stderr
> 
> Yeah maybe it's too fragile, on the other hand it caught what seems to
> be a bug under GIT_TEST_MULTI_PACK_INDEX=true, and the rest of the test
> suite passes, so there's that.

I can image more prudent approaches that would have done the same,
e.g. '! grep ^error stderr'.

> > line will trigger, when anything else during 'git gc' prints
> > something.  And I find it quite strange that an option called
> > '--no-quiet' only shows the commit-graph progress, but not the regular
> > output of 'git gc'.
> 
> It's confusing, but the reason for this seeming discrepancy is that
> writing the commit-graph happens in-process, whereas the rest of the
> work done by git-gc (and its output) comes from subprocesses. Most of
> that output is from "git-repack" / "git-pack-objects" which doesn't pay
> the same attention to --quiet and --no-quiet, instead it checks
> isatty(). See cmd_pack_objects().

This explains what implementation details led to the current
behaviour, but does not justify the actual inconsistency.



Re: [PATCH] packfile: close multi-pack-index in close_all_packs

2018-10-29 Thread SZEDER Gábor
On Thu, Oct 25, 2018 at 12:54:05PM +, Derrick Stolee wrote:
> Whenever we delete pack-files from the object directory, we need
> to also delete the multi-pack-index that may refer to those
> objects. Sometimes, this connection is obvious, like during a
> repack. Other times, this is less obvious, like when gc calls
> a repack command and then does other actions on the objects, like
> write a commit-graph file.
> 
> The pattern we use to avoid out-of-date in-memory packed_git
> structs is to call close_all_packs(). This should also call
> close_midx(). Since we already pass an object store to
> close_all_packs(), this is a nicely scoped operation.
> 
> This fixes a test failure when running t6500-gc.sh with
> GIT_TEST_MULTI_PACK_INDEX=1.
> 
> Reported-by: Szeder Gábor 
> Signed-off-by: Derrick Stolee 
> ---
> 
> Thanks for the report, Szeder! I think this is the correct fix,
> and more likely to be permanent across all builtins that run
> auto-GC. I based it on ds/test-multi-pack-index so it could easily
> be added on top.

OK, avoiding those errors in the first place is good, of course...
However, I still find it disconcerting that those errors didn't cause
'git gc' to error out, and, consequently, what other MIDX-related
errors/bugs might do the same.

> -Stolee
> 
>  packfile.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/packfile.c b/packfile.c
> index 841b36182f..37fcd8f136 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -339,6 +339,11 @@ void close_all_packs(struct raw_object_store *o)
>   BUG("want to close pack marked 'do-not-close'");
>   else
>   close_pack(p);
> +
> + if (o->multi_pack_index) {
> + close_midx(o->multi_pack_index);
> + o->multi_pack_index = NULL;
> + }
>  }
>  
>  /*
> 
> base-commit: 0465a50506023df0932fe0534fe6ac6712c0d854
> -- 
> 2.17.1
> 


Re: [PATCH 12/12] fsck: mark strings for translation

2018-10-29 Thread SZEDER Gábor
On Sun, Oct 28, 2018 at 07:51:57AM +0100, Nguyễn Thái Ngọc Duy wrote:
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/fsck.c | 113 -
>  t/t1410-reflog.sh  |   6 +-
>  t/t1450-fsck.sh|  50 
>  t/t6050-replace.sh |   4 +-
>  t/t7415-submodule-names.sh |   6 +-
>  5 files changed, 94 insertions(+), 85 deletions(-)
> 
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 06eb421720..13f8fe35c5 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -108,8 +108,9 @@ static int fsck_config(const char *var, const char 
> *value, void *cb)
>  static void objreport(struct object *obj, const char *msg_type,
>   const char *err)
>  {
> - fprintf(stderr, "%s in %s %s: %s\n",
> - msg_type, printable_type(obj), describe_object(obj), err);
> + fprintf_ln(stderr, _("%s in %s %s: %s"),

Are the (f)printf() -> (f)printf_ln() changes all over
'builtin/fsck.c' really necessary to mark strings for translation?

> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
> index 90c765df3a..500c21366e 100755
> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh

> @@ -594,7 +594,7 @@ test_expect_success 'fsck --name-objects' '
>   remove_object $(git rev-parse julius:caesar.t) &&
>   test_must_fail git fsck --name-objects >out &&
>   tree=$(git rev-parse --verify julius:) &&
> - egrep "$tree \((refs/heads/master|HEAD)@\{[0-9]*\}:" out
> + test -n "$GETTEXT_POISON" || egrep "$tree 
> \((refs/heads/master|HEAD)@\{[0-9]*\}:" out

'test_i18ngrep' accepts all 'grep' options, so this could be written
as:

  test_i18ngrep -E "..." out


There might be something else wrong with the patch, because now this
test tends to fail on current 'pu' on Travis CI:

  [... snipped output from 'test_commit' ...]
  +git rev-parse julius:caesar.t
  +remove_object be45bbd3809e0829297cefa576e699c134abacfd
  +sha1_file be45bbd3809e0829297cefa576e699c134abacfd
  +remainder=45bbd3809e0829297cefa576e699c134abacfd
  +firsttwo=be
  +echo .git/objects/be/45bbd3809e0829297cefa576e699c134abacfd
  +rm .git/objects/be/45bbd3809e0829297cefa576e699c134abacfd
  +test_must_fail git fsck --name-objects
  +_test_ok=
  +git fsck --name-objects
  +exit_code=2
  +test 2 -eq 0
  +test_match_signal 13 2
  +test 2 = 141
  +test 2 = 269
  +return 1
  +test 2 -gt 129
  +test 2 -eq 127
  +test 2 -eq 126
  +return 0
  +git rev-parse --verify julius:
  +tree=c2fab98f409a47394d992eca10a20e0b22377c0c
  +test -n 
  +egrep c2fab98f409a47394d992eca10a20e0b22377c0c 
\((refs/heads/master|HEAD)@\{[0-9]*\}: out
  error: last command exited with $?=1
  not ok 65 - fsck --name-objects

The contents of 'out':

  broken link fromtree be45bbd3809e0829297cefa576e699c134abacfd 
(refs/heads/master@{1112912113}:caesar.t)
toblob be45bbd3809e0829297cefa576e699c134abacfd 
(refs/heads/master@{1112912113}:caesar.t)
  missing blob be45bbd3809e0829297cefa576e699c134abacfd 
(refs/heads/master@{1112912113}:caesar.t)

On a related (side)note, I think it would be better if this 'grep'
pattern were more explicit about which of the three lines it is
supposed to match.


Anyway, I couldn't reproduce the failure locally yet, but, admittedly,
I didn't try that hard...  And my builds on Travis CI haven't yet come
that far to try this topic on its own.



Re: t7405.17 breakage vanishes with GETTEXT_POISON=1

2018-10-28 Thread SZEDER Gábor
On Sun, Oct 28, 2018 at 06:41:06AM +0100, Duy Nguyen wrote:
> Something fishy is going on but I don't think I'll spend time hunting
> it down so I post here in case somebody else is interested. It might
> also indicate a problem with poison gettext, not the test case too.

I haven't actually run the test under GETTEXT_POISON, but I stongly
suspect it's the test, or more accurately the helper function
'test_i18ngrep'.

The test in question runs

  test_i18ngrep ! "refusing to lose untracked file at" err

which fails in normal test runs, because 'grep' does find the
undesired string; that's the known breakage.  Under GETTEXT_POISION,
however, 'test_i18ngrep' always succeeds because of this condition:

  if test -n "$GETTEXT_POISON"
  then
  # pretend success
  return 0
  fi

and then in turn the whole test succeeds.


Re: [PATCH] travis-ci: no longer use containers

2018-10-25 Thread SZEDER Gábor
On Fri, Oct 26, 2018 at 09:09:48AM +0900, Junio C Hamano wrote:
> Sebastian Staudt  writes:
> 
> > Travis CI will soon deprecate the container-based infrastructure
> > enabled by `sudo: false` in ce59dffb34190e780be2fa9f449f842cadee9753.
> >
> > More info:
> > https://blog.travis-ci.com/2018-10-04-combining-linux-infrastructures
> 
> Thanks for posting a patch that would serve as a good discussion
> starter.  This is not a criticism on your patch, but more is a RFD
> to those who helped our use of Travis by contributing to .travis.yml
> and ci/.
> 
> Don't we need to do some other things so that we can run in vm
> environment, rather than in container environment, before doing this
> change?  IOW, aren't we doing in .travis.yml something we can do
> only in container but not in vm (if there is any), and if so,
> shouldn't we be rewriting that something so that we can run in vm?

As far as I understand, the container-based infrastructure has only
one benefit over the VMs, the shorter startup time.

OTOH, in VMs we can use sudo, which is not available in the
container-based intra.  This has the benefit that after switching to
VMs, we'll be able to install packages by running 'sudo apt-get
install ...'.  Currently the necessary packages are listed in
'.travis.yml' for Travis CI, while for Azure the whole install command
is embedded in '.azureyml'.  After the switch we could consolidate
installing packages by 'sudo apt-get...' in
'ci/install-dependencies.sh' for both.

> I know ce59dffb ("travis-ci: explicity use container-based
> infrastructure", 2016-01-26) only added "sudo: false" without doing
> anything else (e.g. adding things that are only available to those
> who run in container), but if we added stuff that are not usable in
> vm environment after that commit since then, we need to adjust them
> so that we can migrate to the container-based environment, no?
> 
> To me, removing that "sudo: false" line seems like the least thing
> we need to worry about.  After all, they say that whether we have
> "sudo: false" or not, the CI jobs will start running in vm
> environment and not in container.  So if the rest of .travis.yml is
> ready to run in vm environment, we do not have to do anything ;-).
> 
> In short, my question to Lars and SZEDER is, are we already prepared
> to be thrown into a vm environment?

I think we are.  I've run only two builds with this patch, and they
run smoothly and finished successfully.  After you update 'pu' I'll
run more.

> If the answer is "yes", then I think removing "sudo: false" is
> probably still a good thing to do for documentation purposes
> (i.e. showing that we knew we are ready to go through their
> migration).

I agree.


> > Signed-off-by: Sebastian Staudt 
> > ---
> >  .travis.yml | 2 --
> >  1 file changed, 2 deletions(-)
> >
> > diff --git a/.travis.yml b/.travis.yml
> > index 4d4e26c9df..8d2499739e 100644
> > --- a/.travis.yml
> > +++ b/.travis.yml
> > @@ -1,7 +1,5 @@
> >  language: c
> >
> > -sudo: false
> > -
> >  cache:
> >directories:
> >  - $HOME/travis-cache
> > --
> > 2.19.1


Re: [PATCH v3 2/3] khash: silence -Wunused-function in delta-islands from khash

2018-10-25 Thread SZEDER Gábor
On Thu, Oct 25, 2018 at 04:04:26AM -0700, Carlo Marcelo Arenas Belón wrote:
> showing the following when compiled with latest clang (OpenBSD, Fedora
> and macOS):

s/^s/S/
This applies to your other commit messages as well.

But more importantly, please be explicit about the compiler version
that emits the warning, so others won't have to guess when stumbling
upon this commit in a few months or years time.

> delta-islands.c:23:1: warning: unused function 'kh_destroy_str'
>   [-Wunused-function]
> delta-islands.c:23:1: warning: unused function 'kh_clear_str'
>   [-Wunused-function]
> delta-islands.c:23:1: warning: unused function 'kh_del_str' 
> [-Wunused-function]
> 
> Reported-by: René Scharfe 
> Suggested-by: Nguyễn Thái Ngọc Duy 
> Signed-off-by: Carlo Marcelo Arenas Belón 
> Signed-off-by: Junio C Hamano 
> ---
>  khash.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/khash.h b/khash.h
> index d10caa0c35..532109c87f 100644
> --- a/khash.h
> +++ b/khash.h
> @@ -234,7 +234,7 @@ static const double __ac_HASH_UPPER = 0.77;
>   __KHASH_IMPL(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, 
> __hash_equal)
>  
>  #define KHASH_INIT(name, khkey_t, khval_t, kh_is_map, __hash_func, 
> __hash_equal) \
> - KHASH_INIT2(name, static inline, khkey_t, khval_t, kh_is_map, 
> __hash_func, __hash_equal)
> + KHASH_INIT2(name, MAYBE_UNUSED static inline, khkey_t, khval_t, 
> kh_is_map, __hash_func, __hash_equal)
>  
>  /* Other convenient macros... */
>  
> -- 
> 2.19.1
> 


'ds/test-multi-pack-index' vs. 'ab/commit-graph-progress'

2018-10-25 Thread SZEDER Gábor
Hi,

when branch 'ds/test-multi-pack-index' is merged with
'ab/commit-graph-progress', IOW 'master', 'next', or 'pu',
'GIT_TEST_MULTI_PACK_INDEX=1 ./t6500-gc.sh' fails with:

  expecting success:
  git -c gc.writeCommitGraph=true gc --no-quiet >stdout 2>stderr &&
  test_must_be_empty stdout &&
  test_line_count = 1 stderr &&
  test_i18ngrep "Computing commit graph generation numbers" stderr
  
  + git -c gc.writeCommitGraph=true gc --no-quiet
  + test_must_be_empty stdout
  + test_path_is_file stdout
  + test -f stdout
  + test -s stdout
  + test_line_count = 1 stderr
  + test 3 != 3
  + wc -l
  + test 16 = 1
  + echo test_line_count: line count for stderr != 1
  test_line_count: line count for stderr != 1
  + cat stderr
  error: packfile 
.git/objects/pack/pack-c67996b982e718f8e3fa70c5ff7db3cecf688bbb.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-d4f2632c6a37149bb546b8b0cfbc56b8183cd0f8.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-d4f2632c6a37149bb546b8b0cfbc56b8183cd0f8.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-c67996b982e718f8e3fa70c5ff7db3cecf688bbb.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-c67996b982e718f8e3fa70c5ff7db3cecf688bbb.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-c67996b982e718f8e3fa70c5ff7db3cecf688bbb.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-c67996b982e718f8e3fa70c5ff7db3cecf688bbb.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-c67996b982e718f8e3fa70c5ff7db3cecf688bbb.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-d4f2632c6a37149bb546b8b0cfbc56b8183cd0f8.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-d4f2632c6a37149bb546b8b0cfbc56b8183cd0f8.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-d4f2632c6a37149bb546b8b0cfbc56b8183cd0f8.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-d4f2632c6a37149bb546b8b0cfbc56b8183cd0f8.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-d4f2632c6a37149bb546b8b0cfbc56b8183cd0f8.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-d4f2632c6a37149bb546b8b0cfbc56b8183cd0f8.pack index 
unavailable
  error: packfile 
.git/objects/pack/pack-c67996b982e718f8e3fa70c5ff7db3cecf688bbb.pack index 
unavailable
  Computing commit graph generation numbers:  25% (1/4)   ^MComputing commit 
graph generation numbers:  50% (2/4)   ^MComputing commit graph generation 
numbers:  75% (3/4)   ^MComputing commit graph generation numbers: 100% (4/4)   
^MComputing commit graph generation numbers: 100% (4/4), done.
  + return 1
  error: last command exited with $?=1
  not ok 9 - gc --no-quiet


I suspect these "packfile index unavailable" errors are a Bad Thing,
but I didn't follow the MIDX development closely enough to judge.
Surprisingly (to me), 'git gc' didn't exit with error despite these
errors.

Anyway, this test seems to be too fragile, because that

  test_line_count = 1 stderr

line will trigger, when anything else during 'git gc' prints
something.  And I find it quite strange that an option called
'--no-quiet' only shows the commit-graph progress, but not the regular
output of 'git gc'.




Re: [PATCH 18/19] submodule: use submodule repos for object lookup

2018-10-25 Thread SZEDER Gábor
On Tue, Oct 16, 2018 at 04:35:49PM -0700, Stefan Beller wrote:
> This converts the 'show_submodule_header' function to use
> the repository API properly, such that the submodule objects
> are not added to the main object store.

This patch breaks the test suite with 'GIT_TEST_COMMIT_GRAPH=1', in
particular 't4041-diff-submodule-option.sh' fails with:

  expecting success:
  git diff-index -p --submodule=log HEAD >actual &&
  cat >expected <<-EOF &&
  Submodule sm1 $head2..$head3 (rewind):
< Add foo3 ($added foo3)
< Add foo2 ($added foo2)
  EOF
  test_cmp expected actual
  
  + git diff-index -p --submodule=log HEAD
  + cat
  + test_cmp expected actual
  + diff -u expected actual
  --- expected2018-10-25 09:10:00.541610016 +
  +++ actual  2018-10-25 09:10:00.537609885 +
  @@ -1,3 +1,5 @@
  -Submodule sm1 30b9670..dafb207 (rewind):
  +Submodule sm1 30b9670...dafb207:
 < Add foo3 (hinzugefügt foo3)
 < Add foo2 (hinzugefügt foo2)
  +  > Add foo1 (hinzugefügt foo1)
  +  < Add foo1 (hinzugefügt foo1)
  error: last command exited with $?=1
  not ok 9 - modified submodule(backward)

and 't4060-diff-submodule-option-diff-format.sh' fails with:

  expecting success:
  git diff-index -p --submodule=diff HEAD >actual &&
  cat >expected <<-EOF &&
  Submodule sm1 $head2..$head3 (rewind):
  diff --git a/sm1/foo2 b/sm1/foo2
  deleted file mode 100644
  index 54b060e..000
  --- a/sm1/foo2
  +++ /dev/null
  @@ -1 +0,0 @@
  -foo2
  diff --git a/sm1/foo3 b/sm1/foo3
  deleted file mode 100644
  index c1ec6c6..000
  --- a/sm1/foo3
  +++ /dev/null
  @@ -1 +0,0 @@
  -foo3
  EOF
  test_cmp expected actual
  
  + git diff-index -p --submodule=diff HEAD
  + cat
  + test_cmp expected actual
  + diff -u expected actual
  --- expected2018-10-25 09:10:38.854868800 +
  +++ actual  2018-10-25 09:10:38.854868800 +
  @@ -1,4 +1,4 @@
  -Submodule sm1 30b9670..dafb207 (rewind):
  +Submodule sm1 30b9670...dafb207:
   diff --git a/sm1/foo2 b/sm1/foo2
   deleted file mode 100644
   index 54b060e..000
  error: last command exited with $?=1
  not ok 10 - modified submodule(backward)




Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]

2018-10-24 Thread SZEDER Gábor
On Mon, Oct 22, 2018 at 11:54:06AM -0700, Stefan Beller wrote:

> For the sake of a good history, I would think running 'make coccicheck'
> and applying the resulting patches would be best as part of the (dirty)
> merge of any topic that proposes new semantic patches, but that would
> add load to Junio as it would be an extra step during the merge.
> 
> One could argue that the step of applying such transformations into
> the dirty merge is cheaper than resolving merge conflicts that are
> had when the topic includes the transformation.

Please consider that merge commits' have uglier diffs than regular
commits, and that merge commits cause additional complications when
'git bisect' points the finger at them, both of which are exacerbated
by additional changes squeezed into evil merges.

> > Consequently, 'make coccicheck' won't run clean and the
> > static analysis build job will fail until all those topics reach
> > 'master', and the remaining transformations are applied on top.
> >
> > This was (and still is!) an issue with the hasheq()/oideq() series
> > as well: that series was added on 2018-08-28, and the static
> > analysis build job is red on 'pu' ever since.  See the follow-up
> > patch e43d2dcce1 (more oideq/hasheq conversions, 2018-10-02), and
> > one more follow-up will be necessary after the builtin stash topic
> > is merged to 'master'.
> 
> In my understanding this follow up is a feature, as it helps to avoid
> merge conflicts with other topics in flight.

I don't see how such a follow up patch helps to avoid merge conflicts.

There were topics that branched off before the introduction of oideq()
into 'master', therefore they couldn't make use of this new function
until they were merged to 'master' as well, so they added their own
!oidcmp() calls.  That follow up patch was necessary to transform
these new !oidcmp() calls after those topics reached 'master'.  Merge
conflicts had nothing to do with it.

So this follow up patch is not a feature, but rather an inherent
consequence of the project's branching model, with lots of parallel
running topics branching off at different points and progressing at
different speeds.

> > This makes it harder to review other patch series.
> 
> as 'make coccicheck' is an integral part of your review?

Erm, right, "review" was not the right word here.  Anyway, as it is,
'make coccicheck' is an integral part of our automated tests, not only
on Travis CI but on the upcoming Azure thing as well.  I just try to
pay attention to its results and the results of a bunch of my
additional builds, and complain or even send a fix when something goes
reproducibly wrong.  This has certainly became more cumbersome with
the permanently failing static analysis build job in the last couple
of weeks.

> > How about introducing the concept of "pending" semantic patches,
> > stored in 'contrib/coccinelle/.pending.cocci' files, modifying
> > 'make coccicheck' to skip them, and adding the new 'make
> > coccicheck-pending' target to make it convenient to apply them, e.g.
> > something like the simple patch at the end.
> >
> > So the process would go something like this:
> >
> >   - A new semantic patch should be added as "pending", e.g. to the
> > file 'the_repository.pending.cocci', together with the resulting
> > transformations in the same commit.
> >
> > This way neither 'make coccicheck' nor the static analysis build
> > job would complain in the topic branch or in the two integration
> > branches.  And if they do complain, then we would know right away
> > that they complain because of a well-established semantic patch.
> > Yet, anyone interested could run 'make coccicheck-pending' to see
> > where are we heading.
> >
> >   - The author of the "pending" semanting patch should then keep an
> > eye on already cooking topics: whether any of them contain new
> > code that should be transformed, and how they progress to
> > 'master', and sending followup patch(es) with the remaining
> > transformations when applicable.
> >
> > Futhermore, the author should also pay attention to any new topics
> > that branch off after the "pending" semantic patch, and whether
> > any of them introduce code to be transformed, warning their
> > authors as necessary.
> >
> >   - Finally, after all the dust settled, the dev should follow up with
> > a patch to:
> >
> >   - promote the "penging" patch to '.cocci', if its purpose
> > is to avoid undesirable code patterns in the future, or
> >
> >   - remove the semantic patch, if it was used in a one-off
> > transformation.
> >
> > Thoughts?
> 
> I like the approach of having separate classes of semantic patches:
> (a) the regular "we need to keep checking these" as they address
> undesirable code patterns, which is what we currently have,
> and what 'make coccicheck' would complain about.
> (b) The pending patches as 

Re: [PATCH v2 3/3] rebase (autostash): use an explicit OID to apply the stash

2018-10-22 Thread SZEDER Gábor
On Mon, Oct 22, 2018 at 03:15:05PM -0700, Johannes Schindelin via GitGitGadget 
wrote:
> From: Johannes Schindelin 
> 
> When `git stash apply ` sees an argument that consists only of
> digits, it tries to be smart and interpret it as `stash@{}`.
> 
> Unfortunately, an all-digit hash (which is unlikely but still possible)
> is therefore misinterpreted as `stash@{}` reflog.
> 
> To prevent that from happening, let's append `^0` after the stash hash,
> to make sure that it is interpreted as an OID rather than as a number.

Oh, this is clever.

FWIW, all patches look good to me, barring the typo below.

> Signed-off-by: Johannes Schindelin 
> ---
>  builtin/rebase.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 418624837..30d58118c 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -253,6 +253,8 @@ static int apply_autostash(struct rebase_options *opts)
>  
>   if (read_one(path, ))
>   return error(_("Could not read '%s'"), path);
> + /* Ensure that the hash is not mistake for a number */

s/mistake/mistaken/

> + strbuf_addstr(, "^0");
>   argv_array_pushl(_apply.args,
>"stash", "apply", autostash.buf, NULL);
>   stash_apply.git_cmd = 1;
> -- 
> gitgitgadget


[PATCH 6/8] gettext: use an enum for the mode of GETTEXT POISONing

2018-10-22 Thread SZEDER Gábor
The next patch will add a different mode of GETTEXT POISON-ing,
therefore named constants will be better than magic numbers.

Signed-off-by: SZEDER Gábor 
---
 gettext.c | 12 ++--
 gettext.h | 12 +---
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/gettext.c b/gettext.c
index a9509a5df3..c50d1e0377 100644
--- a/gettext.c
+++ b/gettext.c
@@ -47,17 +47,17 @@ const char *get_preferred_languages(void)
 }
 
 #ifdef GETTEXT_POISON
-int use_gettext_poison(void)
+enum poison_mode use_gettext_poison(void)
 {
-   static int poison_requested = -1;
-   if (poison_requested == -1) {
+   static enum poison_mode poison_mode = poison_mode_uninitialized;
+   if (poison_mode == poison_mode_uninitialized) {
const char *v = getenv("GIT_GETTEXT_POISON");
if (v && *v)
-   poison_requested = 1;
+   poison_mode = poison_mode_default;
else
-   poison_requested = 0;
+   poison_mode = poison_mode_none;
}
-   return poison_requested;
+   return poison_mode;
 }
 #endif
 
diff --git a/gettext.h b/gettext.h
index 8e279622f6..fcb6bfaa2c 100644
--- a/gettext.h
+++ b/gettext.h
@@ -42,7 +42,13 @@ static inline int gettext_width(const char *s)
 #endif
 
 #ifdef GETTEXT_POISON
-extern int use_gettext_poison(void);
+enum poison_mode {
+   poison_mode_uninitialized = -1,
+   poison_mode_none = 0,
+   poison_mode_default
+};
+
+extern enum poison_mode use_gettext_poison(void);
 
 #define GETTEXT_POISON_MAGIC "# GETTEXT POISON #"
 #endif
@@ -52,7 +58,7 @@ static inline FORMAT_PRESERVING(1) const char *_(const char 
*msgid)
if (!*msgid)
return "";
 #ifdef GETTEXT_POISON
-   if (use_gettext_poison())
+   if (use_gettext_poison() == poison_mode_default)
return GETTEXT_POISON_MAGIC;
 #endif
return gettext(msgid);
@@ -62,7 +68,7 @@ static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
 const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
 #ifdef GETTEXT_POISON
-   if (use_gettext_poison())
+   if (use_gettext_poison() == poison_mode_default)
return GETTEXT_POISON_MAGIC;
 #endif
return ngettext(msgid, plu, n);
-- 
2.19.1.681.g6bd79da3f5



[PATCH 4/8] gettext: #ifdef away GETTEXT POISON-related code from _() and Q_()

2018-10-22 Thread SZEDER Gábor
The gettext wrapper functions _() and Q_() contain a GETTEXT
POISON-related conditional construct even in non-GETTEXT POISON
builds, though both of those conditions are #define-d to be false
already at compile time.  Both constructs will grow in a later patch,
using a GETTEXT POISON-specific enum type and calling another GETTEXT
POISON-specific function.

Prepare for those future changes and hide the GETTEXT POISON-related
parts of those functions behind an #ifdef.

Signed-off-by: SZEDER Gábor 
---
 gettext.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/gettext.h b/gettext.h
index 7eee64a34f..c658942f7d 100644
--- a/gettext.h
+++ b/gettext.h
@@ -43,22 +43,26 @@ static inline int gettext_width(const char *s)
 
 #ifdef GETTEXT_POISON
 extern int use_gettext_poison(void);
-#else
-#define use_gettext_poison() 0
 #endif
 
 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
if (!*msgid)
return "";
-   return use_gettext_poison() ? "# GETTEXT POISON #" : gettext(msgid);
+#ifdef GETTEXT_POISON
+   if (use_gettext_poison())
+   return "# GETTEXT POISON #";
+#endif
+   return gettext(msgid);
 }
 
 static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
 const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
+#ifdef GETTEXT_POISON
if (use_gettext_poison())
return "# GETTEXT POISON #";
+#endif
return ngettext(msgid, plu, n);
 }
 
-- 
2.19.1.681.g6bd79da3f5



[PATCH 5/8] gettext: put "# GETTEXT POISON #" string literal into a macro

2018-10-22 Thread SZEDER Gábor
The "# GETTEXT POISON #" string literal is currently used in two
functions in 'gettext.h'.  A later patch will add a function to
'gettext.c' using this string literal as well.

Avoid this duplication and put that string literal into a macro which
is only available in GETTEXT POISON builds.

Signed-off-by: SZEDER Gábor 
---
 gettext.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gettext.h b/gettext.h
index c658942f7d..8e279622f6 100644
--- a/gettext.h
+++ b/gettext.h
@@ -43,6 +43,8 @@ static inline int gettext_width(const char *s)
 
 #ifdef GETTEXT_POISON
 extern int use_gettext_poison(void);
+
+#define GETTEXT_POISON_MAGIC "# GETTEXT POISON #"
 #endif
 
 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
@@ -51,7 +53,7 @@ static inline FORMAT_PRESERVING(1) const char *_(const char 
*msgid)
return "";
 #ifdef GETTEXT_POISON
if (use_gettext_poison())
-   return "# GETTEXT POISON #";
+   return GETTEXT_POISON_MAGIC;
 #endif
return gettext(msgid);
 }
@@ -61,7 +63,7 @@ const char *Q_(const char *msgid, const char *plu, unsigned 
long n)
 {
 #ifdef GETTEXT_POISON
if (use_gettext_poison())
-   return "# GETTEXT POISON #";
+   return GETTEXT_POISON_MAGIC;
 #endif
return ngettext(msgid, plu, n);
 }
-- 
2.19.1.681.g6bd79da3f5



[PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled

2018-10-22 Thread SZEDER Gábor
Sometimes tests run with GETTEXT POISON fail because of a reason other
than a translated string that should not have been translated.  In
such a case an error message from a git command in the test's verbose
output is usually, well, less than helpful:

  error: # GETTEXT POISON #

or

  fatal: # GETTEXT POISON #: No such file or directory

It's especially annoying on those rare occasions when a heisenbug
decides it's a good time to suddenly reveal its presence during a
GETTEXT POISON test run, and all we get is an error message like these
(yes, I did actually see both of the above error messages only once).

Make builtin commands' GETTEXT POISON-ed error messages more useful
for debugging failures by introducing a new mode of poisoning: if
$GIT_GETTEXT_POISON is set to 'scrambled', then include the original
untranslated message after that "# GETTEXT_POISON #" string in a
scrambled form, interspersing a '.' after each character.  This way
the messages will remain gibberish enough for machine consumption as
they were before, but at the same time they will be relatively easily
legible for humans.  Take extra care to preserve printf() format
conversion specifiers unaltered when inserting those dots.

Leave 'git-sh-i18n.sh' unchanged, because translatable messages in
scripts often include shell variables, and they could (though
currently they don't) include printf format specifiers, parameter
expansions, command substitutions and whatnot, too.  Dealing with
those in a shell script would be too much hassle without its worth.

There is an additional benefit: as this change considerably increases
the size of translated messages, it could detect cases when we try to
format a translated string into a too small buffer.  E.g. this change
applied on old versions causes test failures because of the bug that
was fixed in 2cfa83574c (bisect_next_all: convert xsnprintf to
xstrfmt, 2017-02-16).

[TODO: Fallout?
   A 'printf(_("foo: %s"), var);' call includes the contents of
   'var' unscrambled in the output.  Could that hide the
   translation of a string that should not have been translated?
   I'm afraid yes: to check the output of that printf() a sloppy
   test could do:

 git plumbing-cmd >out && grep "var's content" out

   which would fail in a regular GETTEXT_POISON test run, but
   would succeed in a scrambled test run.  Does this matter in
   practice, do we care at all?

   Does gettext_scramble() need a FORMAT_PRESERVING annotation?
   Seems to work fine without it so far...]

Signed-off-by: SZEDER Gábor 
---
 gettext.c | 54 +++---
 gettext.h | 11 +--
 2 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/gettext.c b/gettext.c
index c50d1e0377..8ba7fd0bea 100644
--- a/gettext.c
+++ b/gettext.c
@@ -52,13 +52,61 @@ enum poison_mode use_gettext_poison(void)
static enum poison_mode poison_mode = poison_mode_uninitialized;
if (poison_mode == poison_mode_uninitialized) {
const char *v = getenv("GIT_GETTEXT_POISON");
-   if (v && *v)
-   poison_mode = poison_mode_default;
-   else
+   if (v && *v) {
+   if (!strcmp(v, "scrambled"))
+   poison_mode = poison_mode_scrambled;
+   else
+   poison_mode = poison_mode_default;
+   } else
poison_mode = poison_mode_none;
}
return poison_mode;
 }
+
+static int conversion_specifier_len(const char *s)
+{
+   const char printf_conversion_specifiers[] = "diouxXeEfFgGaAcsCSpnm%";
+   const char *format_end;
+
+   if (*s != '%')
+   return 0;
+
+   format_end = strpbrk(s + 1, printf_conversion_specifiers);
+   if (format_end)
+   return format_end - s;
+   else
+   return 0;
+}
+
+const char *gettext_scramble(const char *msg)
+{
+   struct strbuf sb;
+
+   strbuf_init(,
+   /* "# GETTEXT_POISON #" + ' ' + "m.e.s.s.a.g.e." + '\0' */
+   strlen(GETTEXT_POISON_MAGIC) + 1 + 2 * strlen(msg) + 1);
+
+   strbuf_addch(, ' ');
+   while (*msg) {
+   if (*msg == '\n') {
+   strbuf_addch(, *(msg++));
+   continue;
+   } else if (*msg == '%') {
+   int spec_len = conversion_specifier_len(msg);
+   if (spec_len) {
+   strbuf_add(, msg, spec_len);
+   msg += spec_len;
+   continue;
+   }
+   }
+
+   strbuf_addch(, *(msg++));
+   strbuf_addch(, '.');
+   }
+
+   /* This will be leaked... */
+   

[PATCH 8/8] travis-ci: run GETTEXT POISON build job in scrambled mode, too

2018-10-22 Thread SZEDER Gábor
Run the test suite twice in the GETTEXT POISON build: first with
GIT_GETTEXT_POISON=scrambled and then with "regular" poisoning, to see
whether the scrambled mode hid any mis-translations.

Signed-off-by: SZEDER Gábor 
---
 ci/lib-travisci.sh|  1 +
 ci/run-build-and-tests.sh | 10 +++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 109ef280da..fdfa4e035b 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -122,5 +122,6 @@ osx-clang|osx-gcc)
;;
 GETTEXT_POISON)
export GETTEXT_POISON=YesPlease
+   export GIT_GETTEXT_POISON=scrambled
;;
 esac
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 3735ce413f..74ba05e152 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -9,10 +9,14 @@ ln -s "$cache_dir/.prove" t/.prove
 
 make --jobs=2
 make --quiet test
-if test "$jobname" = "linux-gcc"
-then
+case "$jobname" in
+linux-gcc)
GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
-fi
+   ;;
+GETTEXT_POISON)
+   GIT_GETTEXT_POISON=YesPlease make --quiet test
+   ;;
+esac
 
 check_unignored_build_artifacts
 
-- 
2.19.1.681.g6bd79da3f5



[PATCH 3/8] lib-rebase: loosen GETTEXT_POISON check in fake editor

2018-10-22 Thread SZEDER Gábor
The fake editor script created by 't/lib-rebase.sh' recognizes GETTEXT
POSION output when the first line of the file to be edited consists
solely of the GETTEXT POISON magic string as a comment.  However, a
later patch will include additional text after that magic string, so
that check won't work anymore.

So instead of expecting an exact match in the first line, check
whether there are any lines starting with the commented out magic
string.

Signed-off-by: SZEDER Gábor 
---
 t/lib-rebase.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh
index 25a77ee5cb..530f8ec0a8 100644
--- a/t/lib-rebase.sh
+++ b/t/lib-rebase.sh
@@ -29,7 +29,7 @@ set_fake_editor () {
*/COMMIT_EDITMSG)
test -z "$EXPECT_HEADER_COUNT" ||
test "$EXPECT_HEADER_COUNT" = "$(sed -n '1s/^# This is 
a combination of \(.*\) commits\./\1/p' < "$1")" ||
-   test "# # GETTEXT POISON #" = "$(sed -n '1p' < "$1")" ||
+   ! grep -q "^# # GETTEXT POISON #" ||
exit
test -z "$FAKE_COMMIT_MESSAGE" || echo "$FAKE_COMMIT_MESSAGE" > 
"$1"
test -z "$FAKE_COMMIT_AMEND" || echo "$FAKE_COMMIT_AMEND" >> 
"$1"
-- 
2.19.1.681.g6bd79da3f5



[PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty

2018-10-22 Thread SZEDER Gábor
This allows us to run test with non-GETTEXT POISON-ed behavior even in
a GETTEXT POISON build by running:

  GIT_GETTEXT_POISON= ./t1234-foo.sh

Signed-off-by: SZEDER Gábor 
---
 Makefile  | 2 +-
 gettext.c | 9 +++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index ad880d1fc5..7a165445cd 100644
--- a/Makefile
+++ b/Makefile
@@ -365,7 +365,7 @@ all::
 # Define GETTEXT_POISON if you are debugging the choice of strings marked
 # for translation.  In a GETTEXT_POISON build, you can turn all strings marked
 # for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
+# to a non-empty value in your environment.
 #
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
diff --git a/gettext.c b/gettext.c
index 7272771c8e..a9509a5df3 100644
--- a/gettext.c
+++ b/gettext.c
@@ -50,8 +50,13 @@ const char *get_preferred_languages(void)
 int use_gettext_poison(void)
 {
static int poison_requested = -1;
-   if (poison_requested == -1)
-   poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
+   if (poison_requested == -1) {
+   const char *v = getenv("GIT_GETTEXT_POISON");
+   if (v && *v)
+   poison_requested = 1;
+   else
+   poison_requested = 0;
+   }
return poison_requested;
 }
 #endif
-- 
2.19.1.681.g6bd79da3f5



Re: [PATCH] Poison gettext with the Ook language

2018-10-22 Thread SZEDER Gábor
On Mon, Oct 22, 2018 at 05:36:33PM +0200, Nguyễn Thái Ngọc Duy wrote:
> The current gettext() function just replaces all strings with
> '# GETTEXT POISON #' including format strings and hides the things
> that we should be allowed to grep (like branch names, or some other
> codes) even when gettext is poisoned.
> 
> This patch implements the poisoned _() with a universal and totally
> legit language called Ook [1]. We could actually grep stuff even in
> with this because format strings are preserved.

Once upon a time a GETTEXT_POISON build job failed on me, and the
error message:

  error: # GETTEXT POISON #

was not particularly useful.  Ook wouldn't help with that...

So I came up with the following couple of patches that implement a
"scrambled" format that makes the poisoned output legible for humans
but still gibberish for machine consumption (i.e. grep-ing the text
part would still fail):

  error:  U.n.a.b.l.e. .t.o. .c.r.e.a.t.e. .'./home/szeder/src/git/t/trash 
directory.t1404-update-ref-errors/.git/packed-refs...l.o.c.k.'.:. .File 
exists...

I have been running GETTEXT_POISON builds with this series for some
months now, but haven't submitted it yet, because I haven't decided
yet whether including strings (paths, refs, etc.) in the output as
they are is a feature or a flaw.  And because it embarrassingly leaks
every single translated string... :)


SZEDER Gábor (8):
  test-lib.sh: preserve GIT_GETTEXT_POISON from the environment
  gettext: don't poison if GIT_GETTEXT_POISON is set but empty
  lib-rebase: loosen GETTEXT_POISON check in fake editor
  gettext: #ifdef away GETTEXT POISON-related code from _() and Q_()
  gettext: put "# GETTEXT POISON #" string literal into a macro
  gettext: use an enum for the mode of GETTEXT POISONing
  gettext: introduce GIT_GETTEXT_POISON=scrambled
  travis-ci: run GETTEXT POISON build job in scrambled mode, too

 Makefile  |  2 +-
 ci/lib-travisci.sh|  1 +
 ci/run-build-and-tests.sh | 10 +--
 gettext.c | 63 +++
 gettext.h | 33 +++-
 t/lib-rebase.sh   |  2 +-
 t/test-lib.sh | 17 ++-
 7 files changed, 110 insertions(+), 18 deletions(-)

-- 
2.19.1.681.g6bd79da3f5



[PATCH 1/8] test-lib.sh: preserve GIT_GETTEXT_POISON from the environment

2018-10-22 Thread SZEDER Gábor
Setting GIT_GETTEXT_POISON can be useful when testing translated
builds.

However, preserving its value from the environment is not as simple as
adding it to the list of GIT_* variables that should not be scrubbed
from the environment:

  - GIT_GETTEXT_POISON should not influence git commands executed
during initialization of test-lib and the test repo.

Save its value before it gets scrubbed from the environment, so it
will be unset for the duration of the initialization, and restore
its original value after initialization is finished.

  - When testing a GETTEXT_POISON build, 'test-lib.sh' always sets
GIT_GETTEXT_POISON to 'YesPlease'.  This was fine while all that
mattered was whether it's set or not.  However, the following
patches will introduce meaningful values (e.g. "set but empty" to
disable poisoning even in a GETTEXT_POISON build), so only set it
like this if GIT_GETTEXT_POISON was not set in the environment.

Signed-off-by: SZEDER Gábor 
---
 t/test-lib.sh | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index ea2bbaaa7a..282c05110d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -95,6 +95,16 @@ PAGER=cat
 TZ=UTC
 export LANG LC_ALL PAGER TZ
 EDITOR=:
+
+# GIT_GETTEXT_POISON should not influence git commands executed during
+# initialization of test-lib and the test repo.
+# Back it up, unset and then restore after initialization is finished.
+if test -n "${GIT_GETTEXT_POISON-set}"
+then
+   git_gettext_poison_backup=$GIT_GETTEXT_POISON
+   unset GIT_GETTEXT_POISON
+fi
+
 # A call to "unset" with no arguments causes at least Solaris 10
 # /usr/xpg4/bin/sh and /bin/ksh to bail out.  So keep the unsets
 # deriving from the command substitution clustered with the other
@@ -1073,7 +1083,12 @@ test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 # Can we rely on git's output in the C locale?
 if test -n "$GETTEXT_POISON"
 then
-   GIT_GETTEXT_POISON=YesPlease
+   if test -n "${git_gettext_poison_backup-set}"
+   then
+   GIT_GETTEXT_POISON=$git_gettext_poison_backup
+   else
+   GIT_GETTEXT_POISON=YesPlease
+   fi
export GIT_GETTEXT_POISON
test_set_prereq GETTEXT_POISON
 else
-- 
2.19.1.681.g6bd79da3f5



Re: [PATCH] completion: fix __gitcomp_builtin no longer consider extra options

2018-10-22 Thread SZEDER Gábor
On Mon, Oct 22, 2018 at 04:34:16PM +0200, Duy Nguyen wrote:
> On Mon, Oct 22, 2018 at 5:51 AM Junio C Hamano  wrote:
> >
> > Nguyễn Thái Ngọc Duy   writes:
> >
> > > __gitcomp_builtin() has the main completion list provided by
> > >
> > > git xxx --git-completion-helper
> > >
> > > but the caller can also add extra options that is not provided by
> > > --git-completion-helper. The only call site that does this is "git
> > > difftool" completion.
> > >
> > > This support is broken by b221b5ab9b (completion: collapse extra
> > > --no-.. options - 2018-06-06), which adds a special value "--" to mark
> > > that the rest of the options can be hidden by default. The commit
> > > forgets the fact that extra options are appended after
> > > "$(git xxx --git-completion-helper)", i.e. after this "--", and will
> > > be incorrectly hidden as well.
> > >
> > > Prepend the extra options before "$(git xxx --git-completion-helper)"
> > > to avoid this.
> >
> > Thanks for a clear analysis.  How did you find it?  Got annoyed that
> > completion of difftool got broken, or discovered while trying to
> > apply the same trick as difftool completion uses to another one and
> > seeing that the technique does not work?
> 
> I was fixing format-patch completion and was surprised it did not work
> as expected. Never really used difftool myself :P I only found out
> about it when I asked myself "why wasn't this breakage found earlier?"

Erm...  maybe because there are no tests covering the combination of
extra options and '--no-..' options in 't9902-completion.sh'?

(Hint, hint... :)



New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]

2018-10-22 Thread SZEDER Gábor
On Tue, Oct 16, 2018 at 04:35:31PM -0700, Stefan Beller wrote:
> the last patch (applying the semantic patches) has been omitted as that
> would produce a lot of merge conflicts. Without that patch, this merges
> cleanly to next.
> 
> As for when to apply the semantic patches, I wondered if we would prefer a 
> dirty merge
> (created via "make coccicheck && git apply 
> contrib/coccinelle/the_repository.cocci.patch")
> of the semantic patches, or if we'd actually trickle in the changes over some 
> time?

> This series takes another approach as it doesn't change the signature of
> functions, but introduces new functions that can deal with arbitrary 
> repositories, keeping the old function signature around using a shallow 
> wrapper.
> 
> Additionally each patch adds a semantic patch, that would port from the 
> old to
> the new function. These semantic patches are all applied in the very last 
> patch,
> but we could omit applying the last patch if it causes too many merge 
> conflicts
> and trickl in the semantic patches over time when there are no merge 
> conflicts.

I don't really like how this or the previous RFC patch series deal
with semantic patches (or how some past patch series dealt with them,
for that matter), for various reasons:

  - Applying the transformations from several semantic patches in one
single patch makes it harder to review it, because we won't know
which change came from which semantic patch.

For comparison, see the patch series adding hasheq()/oideq(),
merged in 769af0fd9e (Merge branch 'jk/cocci', 2018-09-17), in
particular the four "convert  to " patches.

  - 'make coccicheck' won't run clean (and the static analysis build
job on Travis CI will fail) for all commits following adding the
new semantic patches but before applying the resulting
transformations.

  - These semantic patches interact badly with 'pu' and 'next',
because those integration branches can contain topic branches
adding new code that should be transformed by these semanic
patches.  Consequently, 'make coccicheck' won't run clean and the
static analysis build job will fail until all those topics reach
'master', and the remaining transformations are applied on top.

This was (and still is!) an issue with the hasheq()/oideq() series
as well: that series was added on 2018-08-28, and the static
analysis build job is red on 'pu' ever since.  See the follow-up
patch e43d2dcce1 (more oideq/hasheq conversions, 2018-10-02), and
one more follow-up will be necessary after the builtin stash topic
is merged to 'master'.

This makes it harder to review other patch series.

  - Is it really necessary to carry these semantic patches in-tree?
Let me ellaborate.  There are basically two main use cases for
semantic patches:

  - To avoid undesirable code patterns, e.g. we should not use
sha1_to_hex(oid.hash) or strbuf_addf(, "fixed string"), but
use oid_to_hex() or strbuf_addstr(, "fixed string")
instead.  Note that in these cases we don't remove the
functions sha1_to_hex() or strbuf_addf(), because there are
good reasons to use them in other scenarios.

Our semantic patches under 'contrib/coccinelle/' fall into
this category, and we have 'make coccicheck' and the static
analysis build job on Travis CI to catch these undesirable
code patterns preferably early, and to prevent them from
entering our codebase.

  - To perform one-off code transformations, e.g. to modify a
function's name and/or signature and convert all its
callsites; see e.g. commits abef9020e3 (sha1_file: convert
sha1_object_info* to object_id, 2018-03-12) and b4f5aca40e
(sha1_file: convert read_sha1_file to struct object_id,
2018-03-12).

As far as I understand this patch series falls into this
category: once the conversion is complete the old functions
will be removed.  After that there will be no use for these
semanic patches.

Having said that, it's certainly easier to double-check the
resulting transformations when one can apply the semantic
patches locally, and doing so is easier when the semantic
patches are in tree than when they must be copy-pasted from a
commit message.

OK, that was already long.  Now, can we do better?

How about introducing the concept of "pending" semantic patches,
stored in 'contrib/coccinelle/.pending.cocci' files, modifying
'make coccicheck' to skip them, and adding the new 'make
coccicheck-pending' target to make it convenient to apply them, e.g.
something like the simple patch at the end.

So the process would go something like this:

  - A new semantic patch should be added as "pending", e.g. to the
file 'the_repository.pending.cocci', together with the resulting
transformations in the same 

Re: [PATCH v3 2/8] Add a place for (not) sharing stuff between worktrees

2018-10-22 Thread SZEDER Gábor
On Sun, Oct 21, 2018 at 10:08:53AM +0200, Nguyễn Thái Ngọc Duy wrote:
> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> index e2ee9fc21b..a50fbf8094 100644
> --- a/Documentation/git-worktree.txt
> +++ b/Documentation/git-worktree.txt
> @@ -204,6 +204,22 @@ working trees, it can be used to identify worktrees. For 
> example if
>  you only have two working trees, at "/abc/def/ghi" and "/abc/def/ggg",
>  then "ghi" or "def/ghi" is enough to point to the former working tree.
>  
> +REFS
> +
> +In multiple working trees, some refs may be shared between all working
> +trees, some refs are local. One example is HEAD is different for all
> +working trees. This section is about the sharing rules.
> +
> +In general, all pseudo refs are per working tree and all refs starting
> +with "refs/" are shared. Pseudo refs are ones like HEAD which are
> +directly under GIT_DIR instead of inside GIT_DIR/refs. There are one
> +exception to this: refs inside refs/bisect and refs/worktree is not
> +shared.
> +
> +To access refs, it's best not to look inside GIT_DIR directly. Instead
> +use commands such as linkgit:git-revparse[1] or linkgit:git-update-ref[1]

s/revparse/rev-parse/

> +which will handle refs correctly.
> +


Re: [PATCH] completion: use __gitcomp_builtin for format-patch

2018-10-22 Thread SZEDER Gábor
On Sun, Oct 21, 2018 at 10:41:02AM +0200, Nguyễn Thái Ngọc Duy wrote:
> This helps format-patch gain completion for a couple new options,
> notably --range-diff.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  Of course it will be even better if I could complete the ref for
>  --range-diff=, but maybe another day.
> 
>  contrib/completion/git-completion.bash | 10 +++---
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index c8fdcf8644..065b922777 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1533,12 +1533,8 @@ _git_fetch ()
>  }
>  
>  __git_format_patch_options="
> - --stdout --attach --no-attach --thread --thread= --no-thread
> - --numbered --start-number --numbered-files --keep-subject --signoff
> - --signature --no-signature --in-reply-to= --cc= --full-index --binary
> - --not --all --cover-letter --no-prefix --src-prefix= --dst-prefix=
> - --inline --suffix= --ignore-if-in-upstream --subject-prefix=
> - --output-directory --reroll-count --to= --quiet --notes
> + --full-index --not --all --no-prefix --src-prefix=
> + --dst-prefix= --notes
>  "

$__git_format_patch_options is also used when completing 'git
send-email's options, thus removing all these options will badly
affect that, and in fact makes 't9902-completion.sh' fail with:

  expecting success:
  test_completion "git send-email --cov" "--cover-letter " &&
  test_completion "git send-email ma" "master "
  
  --- expected2018-10-22 09:03:42.255325418 +
  +++ out_sorted  2018-10-22 09:03:42.255325418 +
  @@ -1 +1 @@
  ---cover-letter
  +
  not ok 122 - send-email
  #
  #   test_completion "git send-email --cov" "--cover-letter " &&
  #   test_completion "git send-email ma" "master "
  #


>  _git_format_patch ()
> @@ -1551,7 +1547,7 @@ _git_format_patch ()
>   return
>   ;;
>   --*)
> - __gitcomp "$__git_format_patch_options"
> + __gitcomp_builtin format-patch "$__git_format_patch_options"

You sent out a separate bugfix patch (i.e. not a patch series)
"completion: fix __gitcomp_builtin no longer consider extra options"
in:

  https://public-inbox.org/git/20181021083731.8009-1-pclo...@gmail.com/

and this patch makes the completion of 'git format-patch' also use
extra options...  So, does this patch really work without that bugfix?

>   return
>   ;;
>   esac
> -- 
> 2.19.1.647.g708186aaf9
> 


Re: [PATCH v3 10/12] Add a base implementation of SHA-256 support

2018-10-22 Thread SZEDER Gábor
On Mon, Oct 22, 2018 at 02:43:40AM +, brian m. carlson wrote:

> diff --git a/sha256/block/sha256.c b/sha256/block/sha256.c
> new file mode 100644
> index 00..683bc6e39b
> --- /dev/null
> +++ b/sha256/block/sha256.c
> @@ -0,0 +1,186 @@
> +#include "git-compat-util.h"
> +#include "./sha256.h"
> +
> +#define BLKSIZE blk_SHA256_BLKSIZE

[...]

> +#define Ch(x,y,z)   (z ^ (x & (y ^ z)))
> +#define Maj(x,y,z)  (((x | y) & z) | (x & y))
> +#define S(x, n) ror((x),(n))
> +#define R(x, n) ((x)>>(n))
> +#define Sigma0(x)   (S(x, 2) ^ S(x, 13) ^ S(x, 22))
> +#define Sigma1(x)   (S(x, 6) ^ S(x, 11) ^ S(x, 25))
> +#define Gamma0(x)   (S(x, 7) ^ S(x, 18) ^ R(x, 3))
> +#define Gamma1(x)   (S(x, 17) ^ S(x, 19) ^ R(x, 10))

[...]

> +#define RND(a,b,c,d,e,f,g,h,i,ki)\
> + t0 = h + Sigma1(e) + Ch(e, f, g) + ki + W[i];   \
> + t1 = Sigma0(a) + Maj(a, b, c);  \
> + d += t0;\
> + h  = t0 + t1;

[...]

> +#undef RND
> +#undef Ch
> +#undef Maj
> +#undef S
> +#undef R
> +#undef Sigma0
> +#undef Sigma1
> +#undef Gamma0
> +#undef Gamma1

To protect us from potential "macro redefinition" errors, these
#undefs should come before the #defines above.  Note also that BLKSIZE
is not #undef-ed.



Re: [PATCH v2 07/13] tests: introduce `test_atexit`

2018-10-21 Thread SZEDER Gábor
On Mon, Oct 15, 2018 at 03:12:08AM -0700, Johannes Schindelin via GitGitGadget 
wrote:
> From: Johannes Schindelin 
> 
> When running the p4 daemon or `git daemon`, we want to kill it at the
> end of the test script.
> 
> So far, we do this "manually".
> 
> However, in the next few commits we want to teach the test suite to
> optionally re-run scripts with different options, therefore we will have
> to have a consistent way to stop daemons.
> 
> Let's introduce `test_atexit`, which is loosely modeled after
> `test_when_finished` (but has a broader scope: rather than running the
> commands after the current test case, run them when the test script
> finishes, and also run them when the `--immediate` option is in effect).

I think killing daemons on failure even with '--immediate' will make
life a bit easier.  I remember several occasions when I tried to debug
a failing test with '--immediate', some daemon stayed alive, and then
the next time I run the test it failed during setup.

Besides 'git daemon' and P4, the httpd tests could benefit from this
as well.


> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 78d8c3783b..d7dd0c1be9 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -891,6 +891,35 @@ test_when_finished () {
>   } && (exit \"\$eval_ret\"); eval_ret=\$?; $test_cleanup"
>  }
>  
> +# This function can be used to schedule some commands to be run
> +# unconditionally at the end of the test script, e.g. to stop a daemon:
> +#
> +#test_expect_success 'test git daemon' '
> +#git daemon &
> +#daemon_pid=$! &&
> +#test_atexit "kill $daemon_pid" &&
> +#hello world
> +#'

I think we should add something like this as well:

# Note that these commands will be run even when a test script run
# with '--immediate' fails.  Be careful with your commands to minimize
# any changes to the failed state.

> +
> +test_atexit () {
> + # We cannot detect when we are in a subshell in general, but by
> + # doing so on Bash is better than nothing (the test will
> + # silently pass on other shells).
> + test "${BASH_SUBSHELL-0}" = 0 ||
> + error "bug in test script: test_atexit does nothing in a subshell"
> + test_atexit_cleanup="{ $*
> + } && (exit \"\$eval_ret\"); eval_ret=\$?; $test_atexit_cleanup"
> +}
> +
> +test_atexit_handler () {
> + test : != "$test_atexit_cleanup" || return 0
> +
> + setup_malloc_check
> + test_eval_ "$test_atexit_cleanup"
> + test_atexit_cleanup=:
> + teardown_malloc_check
> +}

The file 'test-lib-functions.sh' contains helper functions to be used
in tests.  'test_atexit_handler' is not such a function, so I think it
would be better to add it to 'test-lib.sh'.

> +
>  # Most tests can use the created repository, but some may need to create 
> more.
>  # Usage: test_create_repo 
>  test_create_repo () {
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 7ed0013f6d..6f9c1f5300 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -413,6 +413,7 @@ test_external_has_tap=0
>  
>  die () {
>   code=$?

# Run atexit handlers even when a test script run with
# '--immediate' fails.

> + test_atexit_handler || code=$?
>   if test -n "$GIT_EXIT_OK"
>   then
>   exit $code
> @@ -826,9 +827,12 @@ write_junit_xml_testcase () {
>   junit_have_testcase=t
>  }
>  
> +test_atexit_cleanup=:
>  test_done () {
>   GIT_EXIT_OK=t
>  
> + test -n "$immediate" || test_atexit_handler

You mentioned elsewhere [1] that you'll remove the condition to run
'test_atexit_handler' unconditionally.  I think this is the right
thing to do, because this condition is responsible for the hanging P4
tests I reported in [2].

So, 'lib-git-p4.sh' stores the pid of the 'p4d' process in the file
"$TRASH_DIRECTORY/p4d.pid" in 'start_p4d', to be read later in
'kill_p4d' to know which process to kill.  When a test script run with
'--immediate' ends successfully, and the atexit handlers are run in
'die' above instead of in 'test_done', then you've got the problem
that 'test_done' removes the trash directory along with the 'p4d.pid'
file before the atexit handlers are run.  With the pidfile gone
'kill_p4d' won't know which process to kill, and the 'p4d' process
will stay alive.

If that condition is removed, then the atexit handlers will be run
before the trash directory is removed, so 'p4d' processes will be
stopped as they should even when an '--immediate' run succeeds.  Good.
Note, however, that in this case 'test_atexit_handler' will be called
twice: first in 'test_done' and then in 'die' as well.  This should be
fine, because 'test_atexit_handler' resets 'test_atexit_cleanup=:', so
the second call will do nothing.  Great.  Perhaps this would be worth
pointing out explicitly in the commit message and/or a comment.

[1] 
https://public-inbox.org/git/nycvar.qro.7.76.6.181016240.4...@tvgsbejvaqbjf.bet/
[2] 

Re: What's cooking in git.git (Oct 2018, #04; Fri, 19)

2018-10-19 Thread SZEDER Gábor
On Fri, Oct 19, 2018 at 03:02:22PM +0900, Junio C Hamano wrote:
> Two large set of topics on "rebase in C" and "rebase -i in C" are
> now in 'next'.

I see occasional failures in 't5520-pull.sh':

expecting success: 
test_config rebase.autostash true &&
test_pull_autostash --rebase

+ test_config rebase.autostash true
+ config_dir=
+ test rebase.autostash = -C
+ test_when_finished test_unconfig  'rebase.autostash'
+ test 0 = 0
+ test_cleanup={ test_unconfig  'rebase.autostash'
} && (exit "$eval_ret"); eval_ret=$?; :
+ git config rebase.autostash true
+ test_pull_autostash --rebase
+ git reset --hard before-rebase
HEAD is now at 12212b3 new file
+ echo dirty
+ git add new_file
+ git pull --rebase . copy
>From .
 * branchcopy   -> FETCH_HEAD
Created autostash: 5417697
HEAD is now at 12212b3 new file
First, rewinding head to replay your work on top of it...
Applying: new file
Applying autostash resulted in conflicts.
Your changes are safe in the stash.
You can run "git stash pop" or "git stash drop" at any time.
+ test_cmp_rev HEAD^ copy
+ git rev-parse --verify HEAD^
+ git rev-parse --verify copy
+ test_cmp expect.rev actual.rev
+ diff -u expect.rev actual.rev
+ cat new_file
cat: new_file: No such file or directory
+ test  = dirty
error: last command exited with $?=1
not ok 25 - pull --rebase succeeds with dirty working directory and 
rebase.autostash set

When running t5520 in a loop, it tends to fail between 10-40
iterations, even when the machine is not under heavy load.

It appears that these failures started with commit 5541bd5b8f (rebase:
default to using the builtin rebase, 2018-08-08), i.e. tip of
'pk/rebase-in-c-6-final', but it's a "flip the big switch" commit, so
not very useful...



Re: [PATCH 07/15] sequencer: make sequencer_make_script() write its script to a strbuf

2018-10-19 Thread SZEDER Gábor
On Fri, Oct 19, 2018 at 05:16:46PM +0900, Junio C Hamano wrote:
> SZEDER Gábor  writes:
> 
> >>if (entry)
> >> -  fprintf(out, "\n%c Branch %s\n", comment_line_char, 
> >> entry->string);
> >> +  strbuf_addf(out, "\n%c Branch %s\n", comment_line_char, 
> >> entry->string);
> >>else
> >> -  fprintf(out, "\n");
> >> +  strbuf_addf(out, "\n");
> >
> > Please use plain strbuf_add() here.
> 
> FWIW, contrib/coccinelle/strbuf.cocci.patch gave us this:
> 
> diff -u -p a/sequencer.c b/sequencer.c
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4311,7 +4311,7 @@ static int make_script_with_merges(struc
>   if (entry)
>   strbuf_addf(out, "\n%c Branch %s\n", comment_line_char, 
> entry->string);
>   else
> - strbuf_addf(out, "\n");
> + strbuf_addstr(out, "\n");
>  
>   while (oidset_contains(, >object.oid) &&
>  !oidset_contains(, >object.oid)) {

Uh, right.  I didn't want to copy-paste a patch with too long lines
into my mailer, as it usually doesn't end too good, so I just typed
the function name.  Evidently I couldn't quite manage.



Re: [PATCH v2 02/13] ci/lib.sh: encapsulate Travis-specific things

2018-10-19 Thread SZEDER Gábor
On Fri, Oct 19, 2018 at 11:06:25AM +0900, Junio C Hamano wrote:
> SZEDER Gábor  writes:
> 
> > On Mon, Oct 15, 2018 at 03:12:00AM -0700, Johannes Schindelin via 
> > GitGitGadget wrote:
> >> diff --git a/ci/lib.sh b/ci/lib.sh
> >> index 06970f7213..8532555b4e 100755
> >> --- a/ci/lib.sh
> >> +++ b/ci/lib.sh
> >> @@ -1,5 +1,26 @@
> >>  # Library of functions shared by all CI scripts
> >>  
> >> +if test true = "$TRAVIS"
> >> +then
> >> +...
> >> +  export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
> >> +  export GIT_TEST_OPTS="--verbose-log -x --immediate"
> >> +fi
> >
> > Please set all these variables ...
> 
> Do you mean "VAR=VAL; export VAR" is kosher, "export VAR=VAL" is
> not?
> 
> >> @@ -81,7 +102,6 @@ check_unignored_build_artifacts ()
> >>  # and installing dependencies.
> >>  set -ex
> >
> > ... after we turn on 'set -x', so the variables' values will be
> > visible in the logs.
> 
> Ah, no, you didn't.  Although I think both are valid points, I think
> ci/lib.sh is expected to be used only inside a more predictable
> environment (e.g. we know the shell used is not a random POSIX shell
> but one that is happy with "export VAR=VAL"), so it should be OK.

Yes.  Travis CI runs an Ubuntu LTS, where /bin/sh is dash, which
understands 'export VAR=val' just fine.  I don't know what Linux
distro runs on Azure Pipelines, but the build definition in patch 6
explicitly asks for Bash, so that should be fine as well.

> Showing the values of these variables in the log may still be good
> idea.
> 
> > (Or move this 'set -ex' to the beginning of the script?  Then we
> > could perhaps avoid similar issues in the future.)
> 
> Sure (provided that it is an issue to begin with---if we are
> interested in the value of TRAVIS_BRANCH, for example, being able to
> see it only because "CI_BRANCH=$TRAVIS_BRANCH" assignment is made
> feels a bit sideways---we'd be better off explicitly logging
> anything we are interested in in the longer term, no?).

Well, all the $TRAVIS_* and $CI_* variables are not that interessant,
because they are only used in the build scripts, and then we can see
their substituted values in the build logs.  The same applies to
the variables $cache_dir and $BREW_INSTALL_PACKAGES as well.

$GIT_PROVE_OPTS and $GIT_TEST_OPTS, however, are only used in
't/Makefile' but not in the build scripts, thus their values don't
show up in the build logs.

I run some Travis CI builds with custom $GIT_TEST_OPTS, which, of
course, conflicted with this patch, and I messed up the conflict
resolution.  I think I would have noticed sooner what went wrong, if
the value of $GIT_TEST_OPTS were visible in the build logs.

I've found the output with 'set -x' sufficient so far, and don't think
that an explicit logging facility is worth it.



Re: [PATCH v2 02/13] ci/lib.sh: encapsulate Travis-specific things

2018-10-18 Thread SZEDER Gábor
On Mon, Oct 15, 2018 at 03:12:00AM -0700, Johannes Schindelin via GitGitGadget 
wrote:
> diff --git a/ci/lib.sh b/ci/lib.sh
> index 06970f7213..8532555b4e 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -1,5 +1,26 @@
>  # Library of functions shared by all CI scripts
>  
> +if test true = "$TRAVIS"
> +then
> + # We are running within Travis CI
> + CI_BRANCH="$TRAVIS_BRANCH"
> + CI_COMMIT="$TRAVIS_COMMIT"
> + CI_JOB_ID="$TRAVIS_JOB_ID"
> + CI_JOB_NUMBER="$TRAVIS_JOB_NUMBER"
> + CI_OS_NAME="$TRAVIS_OS_NAME"
> + CI_REPO_SLUG="$TRAVIS_REPO_SLUG"
> +
> + cache_dir="$HOME/travis-cache"
> +
> + url_for_job_id () {
> + echo "https://travis-ci.org/$CI_REPO_SLUG/jobs/$1;
> + }
> +
> + BREW_INSTALL_PACKAGES="git-lfs gettext"
> + export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
> + export GIT_TEST_OPTS="--verbose-log -x --immediate"
> +fi

Please set all these variables ...

> +
>  skip_branch_tip_with_tag () {
>   # Sometimes, a branch is pushed at the same time the tag that points
>   # at the same commit as the tip of the branch is pushed, and building
> @@ -13,10 +34,10 @@ skip_branch_tip_with_tag () {
>   # we can skip the build because we won't be skipping a build
>   # of a tag.
>  
> - if TAG=$(git describe --exact-match "$TRAVIS_BRANCH" 2>/dev/null) &&
> - test "$TAG" != "$TRAVIS_BRANCH"
> + if TAG=$(git describe --exact-match "$CI_BRANCH" 2>/dev/null) &&
> + test "$TAG" != "$CI_BRANCH"
>   then
> - echo "$(tput setaf 2)Tip of $TRAVIS_BRANCH is exactly at 
> $TAG$(tput sgr0)"
> + echo "$(tput setaf 2)Tip of $CI_BRANCH is exactly at $TAG$(tput 
> sgr0)"
>   exit 0
>   fi
>  }
> @@ -25,7 +46,7 @@ skip_branch_tip_with_tag () {
>  # job if we encounter the same tree again and can provide a useful info
>  # message.
>  save_good_tree () {
> - echo "$(git rev-parse $TRAVIS_COMMIT^{tree}) $TRAVIS_COMMIT 
> $TRAVIS_JOB_NUMBER $TRAVIS_JOB_ID" >>"$good_trees_file"
> + echo "$(git rev-parse $CI_COMMIT^{tree}) $CI_COMMIT $CI_JOB_NUMBER 
> $CI_JOB_ID" >>"$good_trees_file"
>   # limit the file size
>   tail -1000 "$good_trees_file" >"$good_trees_file".tmp
>   mv "$good_trees_file".tmp "$good_trees_file"
> @@ -35,7 +56,7 @@ save_good_tree () {
>  # successfully before (e.g. because the branch got rebased, changing only
>  # the commit messages).
>  skip_good_tree () {
> - if ! good_tree_info="$(grep "^$(git rev-parse $TRAVIS_COMMIT^{tree}) " 
> "$good_trees_file")"
> + if ! good_tree_info="$(grep "^$(git rev-parse $CI_COMMIT^{tree}) " 
> "$good_trees_file")"
>   then
>   # Haven't seen this tree yet, or no cached good trees file yet.
>   # Continue the build job.
> @@ -45,18 +66,18 @@ skip_good_tree () {
>   echo "$good_tree_info" | {
>   read tree prev_good_commit prev_good_job_number prev_good_job_id
>  
> - if test "$TRAVIS_JOB_ID" = "$prev_good_job_id"
> + if test "$CI_JOB_ID" = "$prev_good_job_id"
>   then
>   cat <<-EOF
> - $(tput setaf 2)Skipping build job for commit 
> $TRAVIS_COMMIT.$(tput sgr0)
> + $(tput setaf 2)Skipping build job for commit 
> $CI_COMMIT.$(tput sgr0)
>   This commit has already been built and tested 
> successfully by this build job.
>   To force a re-build delete the branch's cache and then 
> hit 'Restart job'.
>   EOF
>   else
>   cat <<-EOF
> - $(tput setaf 2)Skipping build job for commit 
> $TRAVIS_COMMIT.$(tput sgr0)
> + $(tput setaf 2)Skipping build job for commit 
> $CI_COMMIT.$(tput sgr0)
>   This commit's tree has already been built and tested 
> successfully in build job $prev_good_job_number for commit $prev_good_commit.
> - The log of that build job is available at 
> https://travis-ci.org/$TRAVIS_REPO_SLUG/jobs/$prev_good_job_id
> + The log of that build job is available at 
> $(url_for_job_id $prev_good_job_id)
>   To force a re-build delete the branch's cache and then 
> hit 'Restart job'.
>   EOF
>   fi
> @@ -81,7 +102,6 @@ check_unignored_build_artifacts ()
>  # and installing dependencies.
>  set -ex

... after we turn on 'set -x', so the variables' values will be
visible in the logs.

(Or move this 'set -ex' to the beginning of the script?  Then we
could perhaps avoid similar issues in the future.)

> -cache_dir="$HOME/travis-cache"
>  good_trees_file="$cache_dir/good-trees"
>  
>  mkdir -p "$cache_dir"


Re: [PATCH v2 09/13] git-p4: use `test_atexit` to kill the daemon

2018-10-17 Thread SZEDER Gábor
On Mon, Oct 15, 2018 at 03:12:11AM -0700, Johannes Schindelin via GitGitGadget 
wrote:
> From: Johannes Schindelin 
> 
> This should be more reliable than the current method, and prepares the
> test suite for a consistent way to clean up before re-running the tests
> with different options.

This patch makes the test suite hang in all four Travis CI build jobs
with P4 installed without any of the P4 tests finishing.

Reverting this patch from the whole patch series makes it work again.

I've also tried to revert only this first hunk of the patch below,
because based on the comment I thought it's worth a try, but it didn't
really help.  It did make a difference: the 300s watchdog timer
eventually kicked in, and then the test scripts could finish
successfully...  but there are a lot of P4 test scripts, and with each
taking 300s the build job still timeouted.

All this may (or may not) be related to and be a different symptom of
the leftover p4d processes Luke mentioned.  I couldn't reproduce any
of this on my machine so far.

> diff --git a/t/lib-git-p4.sh b/t/lib-git-p4.sh
> index c27599474c..f4f5d7d296 100644
> --- a/t/lib-git-p4.sh
> +++ b/t/lib-git-p4.sh
> @@ -74,15 +74,6 @@ cli="$TRASH_DIRECTORY/cli"
>  git="$TRASH_DIRECTORY/git"
>  pidfile="$TRASH_DIRECTORY/p4d.pid"
>  
> -# Sometimes "prove" seems to hang on exit because p4d is still running
> -cleanup () {
> - if test -f "$pidfile"
> - then
> - kill -9 $(cat "$pidfile") 2>/dev/null && exit 255
> - fi
> -}
> -trap cleanup EXIT
> -
>  # git p4 submit generates a temp file, which will
>  # not get cleaned up if the submission fails.  Don't
>  # clutter up /tmp on the test machine.


Re: [PATCH v2 10/13] Add a base implementation of SHA-256 support

2018-10-17 Thread SZEDER Gábor
On Mon, Oct 15, 2018 at 02:18:57AM +, brian m. carlson wrote:
> diff --git a/sha256/block/sha256.c b/sha256/block/sha256.c
> new file mode 100644
> index 00..18350c161a
> --- /dev/null
> +++ b/sha256/block/sha256.c
> @@ -0,0 +1,180 @@
> +#include "git-compat-util.h"
> +#include "./sha256.h"
> +
> +#define BLKSIZE blk_SHA256_BLKSIZE
> +
> +void blk_SHA256_Init(blk_SHA256_CTX *ctx)
> +{
> + ctx->offset = 0;
> + ctx->length = 0;
> + ctx->state[0] = 0x6A09E667UL;
> + ctx->state[1] = 0xBB67AE85UL;
> + ctx->state[2] = 0x3C6EF372UL;
> + ctx->state[3] = 0xA54FF53AUL;
> + ctx->state[4] = 0x510E527FUL;
> + ctx->state[5] = 0x9B05688CUL;
> + ctx->state[6] = 0x1F83D9ABUL;
> + ctx->state[7] = 0x5BE0CD19UL;
> +}
> +
> +static inline uint32_t ror(uint32_t x, unsigned n)
> +{
> + return (x >> n) | (x << (32 - n));
> +}
> +
> +#define Ch(x,y,z)   (z ^ (x & (y ^ z)))
> +#define Maj(x,y,z)  (((x | y) & z) | (x & y))
> +#define S(x, n) ror((x),(n))
> +#define R(x, n) ((x)>>(n))
> +#define Sigma0(x)   (S(x, 2) ^ S(x, 13) ^ S(x, 22))
> +#define Sigma1(x)   (S(x, 6) ^ S(x, 11) ^ S(x, 25))
> +#define Gamma0(x)   (S(x, 7) ^ S(x, 18) ^ R(x, 3))
> +#define Gamma1(x)   (S(x, 17) ^ S(x, 19) ^ R(x, 10))

[...]

> +#define RND(a,b,c,d,e,f,g,h,i,ki)\
> + t0 = h + Sigma1(e) + Ch(e, f, g) + ki + W[i];   \
> + t1 = Sigma0(a) + Maj(a, b, c);  \
> + d += t0;\
> + h  = t0 + t1;
> +
> + RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98);

[...]

> +#undef RND
> +
> + for (i = 0; i < 8; i++) {
> + ctx->state[i] = ctx->state[i] + S[i];
> + }
> +}
> +
> +#define MIN(x, y) ((x) < (y) ? (x) : (y))

On macOS there is a MIN macro already defined in the system headers,
resulting in the following error:

  CC sha256/block/sha256.o
  sha256/block/sha256.c:133:9: error: 'MIN' macro redefined 
[-Werror,-Wmacro-redefined]
  #define MIN(x, y) ((x) < (y) ? (x) : (y))
  ^
  /usr/include/sys/param.h:215:9: note: previous definition is here
  #define MIN(a,b) (((a)<(b))?(a):(b))
  ^
  1 error generated.
  make: *** [sha256/block/sha256.o] Error 1

A simple "#undef MIN" solves this issue.  However, I wonder whether we
should #undef the other #define directives as well, just to be sure
(and perhaps overly cautious).

> +void blk_SHA256_Update(blk_SHA256_CTX *ctx, const void *data, size_t len)
> +{
> + const unsigned char *in = data;
> + size_t n;
> + ctx->length += len;
> + while (len > 0) {
> + if (!ctx->offset && len >= BLKSIZE) {
> + blk_SHA256_Transform(ctx, in);
> + in += BLKSIZE;
> + len -= BLKSIZE;
> + } else {
> + n = MIN(len, (BLKSIZE - ctx->offset));
> + memcpy(ctx->buf + ctx->offset, in, n);
> + ctx->offset += n;
> + in += n;
> + len -= n;
> + if (ctx->offset == BLKSIZE) {
> + blk_SHA256_Transform(ctx, ctx->buf);
> + ctx->offset = 0;
> + }
> + }
> + }
> +}
> +
> +void blk_SHA256_Final(unsigned char *digest, blk_SHA256_CTX *ctx)
> +{
> + const unsigned trip = BLKSIZE - sizeof(ctx->length);
> + int i;
> +
> + ctx->length <<= 3;
> + ctx->buf[ctx->offset++] = 0x80;
> +
> + if (ctx->offset > trip) {
> + memset(ctx->buf + ctx->offset, 0, BLKSIZE - ctx->offset);
> + blk_SHA256_Transform(ctx, ctx->buf);
> + ctx->offset = 0;
> + }
> +
> + memset(ctx->buf + ctx->offset, 0, BLKSIZE - ctx->offset - 
> sizeof(ctx->length));
> +
> + put_be64(ctx->buf + trip, ctx->length);

Some GCC versions (e.g. gcc-4.8 with -O2 -Wall -Werror) complain about
the above line:

  CC sha256/block/sha256.o
  sha256/block/sha256.c: In function ‘blk_SHA256_Final’:
  sha256/block/sha256.c:174:2: error: dereferencing type-punned pointer will 
break strict-aliasing rules [-Werror=strict-aliasing]
put_be64(ctx->buf + trip, ctx->length);
^
  cc1: all warnings being treated as errors
  make: *** [sha256/block/sha256.o] Error 1

Something like this makes it compile:

  void *ptr = ctx->buf + trip;
  put_be64(ptr, ctx->length);

However, it's not immediately obvious to me why the compiler
complains, or why that intermediate void* variable makes any
difference, but now it's not the time to put on my language lawyer
hat.

Perhaps an old compiler bug?  Clang in general, newer GCC versions, or
gcc-4.8 with -Wall -Werror but without -O2 don't seem to be affected.


> + blk_SHA256_Transform(ctx, ctx->buf);
> +
> + /* copy output */
> + for (i = 0; i < 8; i++, digest += sizeof(uint32_t))
> + put_be32(digest, ctx->state[i]);
> +}


Re: On overriding make variables from the environment...

2018-10-17 Thread SZEDER Gábor
On Tue, Oct 16, 2018 at 03:40:01PM -0700, Jonathan Nieder wrote:
> SZEDER Gábor wrote:
> > On Tue, Oct 16, 2018 at 02:54:56PM -0700, Jonathan Nieder wrote:
> >> SZEDER Gábor wrote:
> 
> >>> Our Makefile has lines like these:
> >>>
> >>>   CFLAGS = -g -O2 -Wall
> >>>   CC = cc
> >>>   AR = ar
> >>>   SPATCH = spatch
> [...]
> >>> I'm not sure what to do.  I'm fine with updating our 'ci/' scripts to
> >>> explicitly respect CC in the environment (either by running 'make
> >>> CC=$CC' or by writing $CC into 'config.mak').  Or I could update our
> >>> Makefile to use '?=' for specific variables, but:
> >>
> >> That's a good question.  I don't have a strong opinion myself, so I
> >> tend to trust larger projects like Linux to have thought this through
> >> more, and they use 'CC = cc' as well.
> >
> > I don't think Linux is a good example to follow in this case, see e.g.
> > 6d62c983f7 (Makefile: Change the default compiler from "gcc" to "cc",
> > 2011-12-20) (in short: Git is supposed to be buildable with compilers
> > other than GCC as well, while Linux not really, so they can hardcode
> > 'CC = gcc').
> 
> Nowadays Linux builds with clang as well.  People also have other
> reasons to override the CC setting (cross-compiling, etc) and to
> override other settings like CFLAGS.
> 
> > Also, the projects I have on hand usually have 'CC = gcc' hardcoded in
> > their Makefiles, too, but those Makefiles were generated by their
> > ./configure script (which in turn by ./autogen.sh...), and those tend
> > to write CC from the environment into the generated Makefiles.
> 
> Yes, I think that's what makes travis's setup normally work.  It makes
> sense to me since ./configure is expected to have more implicit or
> automatic behavior.  For "make", I kind of like that it doesn't depend
> on environment variables that I am not thinking about when debugging a
> reported build problem.
> 
> When building Git without autoconf, the corresponding place for
> settings is config.mak.  Would it make sense for the ci scripts to
> write a config.mak file?

A first I though it doesn't, but it turns out it acually does.

'ci/run-build-and-tests.sh' basically runs:

  make
  make test

I naively put a 'CC=$CC' argument at the end of the first command,
thinking it should Just Work...  but then that second 'make test' got
all clever on me, said "* new build flags", and then proceeded to
rebuild everything with the default 'cc'.  (With the additional
complication, that on Travis CI we actually run 'make --quiet test',
which rebuilds everything, well, quietly...  so the rebuild itself is
not even visible in the build logs.)

So, then it's either 'config.mak', or passing a 'CC=$CC' argument to
_all_ make commands, including those that are not supposed to build
anything, but only run the tests.  I find the latter aesthetically not
particularly pleasing.



Re: [PATCH v2 02/13] sha1-file: provide functions to look up hash algorithms

2018-10-17 Thread SZEDER Gábor
On Mon, Oct 15, 2018 at 02:18:49AM +, brian m. carlson wrote:
> There are several ways we might refer to a hash algorithm: by name, such
> as in the config file; by format ID, such as in a pack; or internally,
> by a pointer to the hash_algos array.  Provide functions to look up hash
> algorithms based on these various forms and return the internal constant
> used for them.  If conversion to another form is necessary, this
> internal constant can be used to look up the proper data in the
> hash_algos array.
> 
> Signed-off-by: brian m. carlson 
> ---
>  hash.h  | 13 +
>  sha1-file.c | 21 +
>  2 files changed, 34 insertions(+)
> 
> diff --git a/hash.h b/hash.h
> index 7c8238bc2e..90f4344619 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -98,4 +98,17 @@ struct git_hash_algo {
>  };
>  extern const struct git_hash_algo hash_algos[GIT_HASH_NALGOS];
>  
> +/*
> + * Return a GIT_HASH_* constant based on the name.  Returns GIT_HASH_UNKNOWN 
> if
> + * the name doesn't match a known algorithm.
> + */
> +int hash_algo_by_name(const char *name);
> +/* Identical, except based on the format ID. */
> +int hash_algo_by_id(uint32_t format_id);
> +/* Identical, except for a pointer to struct git_hash_algo. */
> +inline int hash_algo_by_ptr(const struct git_hash_algo *p)

This has to be declared as static, otherwise the linker will error out
when building without optimization:

LINK git
libgit.a(commit-graph.o): In function `oid_version':
/home/szeder/src/git/commit-graph.c:48: undefined reference to 
`hash_algo_by_ptr'
libgit.a(hex.o): In function `hash_to_hex':
/home/szeder/src/git/hex.c:123: undefined reference to `hash_algo_by_ptr'
libgit.a(hex.o): In function `oid_to_hex':
/home/szeder/src/git/hex.c:128: undefined reference to `hash_algo_by_ptr'
collect2: error: ld returned 1 exit status
Makefile:2055: recipe for target 'git' failed
make: *** [git] Error 1


> +{
> + return p - hash_algos;
> +}
> +
>  #endif
> diff --git a/sha1-file.c b/sha1-file.c
> index e29825f259..3a75d515eb 100644
> --- a/sha1-file.c
> +++ b/sha1-file.c
> @@ -122,6 +122,27 @@ const char *empty_blob_oid_hex(void)
>   return oid_to_hex_r(buf, the_hash_algo->empty_blob);
>  }
>  
> +int hash_algo_by_name(const char *name)
> +{
> + int i;
> + if (!name)
> + return GIT_HASH_UNKNOWN;
> + for (i = 1; i < GIT_HASH_NALGOS; i++)
> + if (!strcmp(name, hash_algos[i].name))
> + return i;
> + return GIT_HASH_UNKNOWN;
> +}
> +
> +int hash_algo_by_id(uint32_t format_id)
> +{
> + int i;
> + for (i = 1; i < GIT_HASH_NALGOS; i++)
> + if (format_id == hash_algos[i].format_id)
> + return i;
> + return GIT_HASH_UNKNOWN;
> +}
> +
> +
>  /*
>   * This is meant to hold a *small* number of objects that you would
>   * want read_sha1_file() to be able to return, but yet you do not want


Re: On overriding make variables from the environment...

2018-10-16 Thread SZEDER Gábor
On Tue, Oct 16, 2018 at 02:54:56PM -0700, Jonathan Nieder wrote:
> SZEDER Gábor wrote:
> > Our Makefile has lines like these:
> >
> >   CFLAGS = -g -O2 -Wall
> >   CC = cc
> >   AR = ar
> >   SPATCH = spatch
> >
> > Note the use of '=', not '?='.
> [...]
> > I'm not sure what to do.  I'm fine with updating our 'ci/' scripts to
> > explicitly respect CC in the environment (either by running 'make
> > CC=$CC' or by writing $CC into 'config.mak').  Or I could update our
> > Makefile to use '?=' for specific variables, but:
> 
> That's a good question.  I don't have a strong opinion myself, so I
> tend to trust larger projects like Linux to have thought this through
> more, and they use 'CC = cc' as well.

I don't think Linux is a good example to follow in this case, see e.g.
6d62c983f7 (Makefile: Change the default compiler from "gcc" to "cc",
2011-12-20) (in short: Git is supposed to be buildable with compilers
other than GCC as well, while Linux not really, so they can hardcode
'CC = gcc').

Also, the projects I have on hand usually have 'CC = gcc' hardcoded in
their Makefiles, too, but those Makefiles were generated by their
./configure script (which in turn by ./autogen.sh...), and those tend
to write CC from the environment into the generated Makefiles.



Re: [PATCH v2 06/13] Add a build definition for Azure DevOps

2018-10-16 Thread SZEDER Gábor
On Mon, Oct 15, 2018 at 03:12:06AM -0700, Johannes Schindelin via GitGitGadget 
wrote:
> diff --git a/azure-pipelines.yml b/azure-pipelines.yml
> new file mode 100644
> index 00..b5749121d2
> --- /dev/null
> +++ b/azure-pipelines.yml
> @@ -0,0 +1,319 @@
> +resources:
> +- repo: self
> +  fetchDepth: 1
> +
> +phases:
> +- phase: linux_clang
> +  displayName: linux-clang
> +  condition: succeeded()
> +  queue:
> +name: Hosted Ubuntu 1604
> +  steps:
> +  - bash: |
> +   test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || 
> ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache 
> gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
> +
> +   sudo apt-get update &&
> +   sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev 
> libexpat-dev tcl tk gettext git-email zlib1g-dev apache2-bin &&
> +
> +   export CC=clang || exit 1
> +
> +   ci/install-dependencies.sh

I think you would want to 'exit 1' when this script fails.
This applies to other build jobs (erm, phases?) below as well.

> +   ci/run-build-and-tests.sh || {
> +   ci/print-test-failures.sh
> +   exit 1
> +   }
> +
> +   test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount 
> "$HOME/test-cache" || exit 1
> +displayName: 'ci/run-build-and-tests.sh'
> +env:
> +  GITFILESHAREPWD: $(gitfileshare.pwd)
> +  - task: PublishTestResults@2
> +displayName: 'Publish Test Results **/TEST-*.xml'
> +inputs:
> +  mergeTestResults: true
> +  testRunTitle: 'linux-clang'
> +  platform: Linux
> +  publishRunAttachments: false
> +condition: succeededOrFailed()
> +
> +- phase: linux_gcc
> +  displayName: linux-gcc
> +  condition: succeeded()
> +  queue:
> +name: Hosted Ubuntu 1604
> +  steps:
> +  - bash: |
> +   test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || 
> ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache 
> gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
> +
> +   sudo apt-get update &&
> +   sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev 
> libexpat-dev tcl tk gettext git-email zlib1g-dev apache2-bin || exit 1
> +

On Travis CI the Linux GCC build job uses gcc-8 instead of whatever
the default is in that old-ish Ubuntu LTS; see 37fa4b3c78 (travis-ci:
run gcc-8 on linux-gcc jobs, 2018-05-19).

> +   ci/install-dependencies.sh
> +   ci/run-build-and-tests.sh || {
> +   ci/print-test-failures.sh
> +   exit 1
> +   }
> +
> +   test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount 
> "$HOME/test-cache" || exit 1
> +displayName: 'ci/run-build-and-tests.sh'
> +env:
> +  GITFILESHAREPWD: $(gitfileshare.pwd)
> +  - task: PublishTestResults@2
> +displayName: 'Publish Test Results **/TEST-*.xml'
> +inputs:
> +  mergeTestResults: true
> +  testRunTitle: 'linux-gcc'
> +  platform: Linux
> +  publishRunAttachments: false
> +condition: succeededOrFailed()
> +
> +- phase: osx_clang
> +  displayName: osx-clang
> +  condition: succeeded()
> +  queue:
> +name: Hosted macOS
> +  steps:
> +  - bash: |
> +   test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || 
> ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache 
> gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
> +
> +   export CC=clang
> +
> +   ci/install-dependencies.sh
> +   ci/run-build-and-tests.sh || {
> +   ci/print-test-failures.sh
> +   exit 1
> +   }
> +
> +   test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || umount 
> "$HOME/test-cache" || exit 1
> +displayName: 'ci/run-build-and-tests.sh'
> +env:
> +  GITFILESHAREPWD: $(gitfileshare.pwd)
> +  - task: PublishTestResults@2
> +displayName: 'Publish Test Results **/TEST-*.xml'
> +inputs:
> +  mergeTestResults: true
> +  testRunTitle: 'osx-clang'
> +  platform: macOS
> +  publishRunAttachments: false
> +condition: succeededOrFailed()
> +
> +- phase: osx_gcc
> +  displayName: osx-gcc
> +  condition: succeeded()
> +  queue:
> +name: Hosted macOS
> +  steps:
> +  - bash: |
> +   test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || 
> ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache 
> gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
> +

Here you should 'export CC=gcc', because on macOS 'cc' is 'clang' by
default.

Note, however, that setting 'CC' in the environment alone has no
effect on the build process, it will still use 'cc'.  Keep an eye on
where this thread will lead to:

  https://public-inbox.org/git/20181016184537.gn19...@szeder.dev/T/#u

> +   ci/install-dependencies.sh
> +   ci/run-build-and-tests.sh || {
> +   ci/print-test-failures.sh
> +   exit 1
> +   }
> +
> +   test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || umount 
> "$HOME/test-cache" || exit 1
> +displayName: 

On overriding make variables from the environment...

2018-10-16 Thread SZEDER Gábor
Our Makefile has lines like these:

  CFLAGS = -g -O2 -Wall
  CC = cc
  AR = ar
  SPATCH = spatch

Note the use of '=', not '?='.  This means that these variables can be
overridden from the command line, i.e. 'make CC=gcc-X.Y' will build
with that particular GCC version, but not from the environment, i.e.
'CC=gcc-X.Y make' will still build with 'cc'.

This can be confusing for developers who come from other projects
where they used to run 'CC=whatever make'.

And our build jobs on Travis CI are badly affected by this.  We have
dedicated build jobs to build Git with GCC and Clang both on Linux and
OSX from the very beginning (522354d70f (Add Travis CI support,
2015-11-27)).  But guess how Travis CI specifies which compiler to
use!  With 'export CC=gcc' and 'export CC=clang', respectively.
Consequently, our Clang Linux build job has always used gcc, because
that's where 'cc' points at on Linux by default, while the GCC OSX
build job has always used Clang.  Oh, well.  Furthermore, see
37fa4b3c78 (travis-ci: run gcc-8 on linux-gcc jobs, 2018-05-19), where
Duy added an 'export CC=gcc-8' in an attempt to use a more modern
compiler, but this had no effect either.

I'm not sure what to do.  I'm fine with updating our 'ci/' scripts to
explicitly respect CC in the environment (either by running 'make
CC=$CC' or by writing $CC into 'config.mak').  Or I could update our
Makefile to use '?=' for specific variables, but:

  - I'm afraid to break somebody's setup relying on the current
behavior and CC having different values in the environment and in
'config.mak'.

  - Where to stop, IOW which variables should be set with '?='?
CFLAGS, LDFLAGS, CC, AR, ..., SPATCH, SPATCH_FLAGS?  Dunno.  We
already have 'STRIP ?= strip' and there are variables that are
checked explicitly (e.g. 'DEVELOPER=y make' works).

Note also that prior to b05701c5b4 (Make CFLAGS overridable from
make command line., 2005-08-06) our Makefile used 'CC?=gcc' as
well.



  1   2   3   4   5   6   7   8   9   10   >