Re: [RFC] test-lib: detect common misuse of test_expect_failure

2016-10-17 Thread Junio C Hamano
Jeff King  writes:

> I like the general idea, but I'm not sure how this would interact with
> the tests in t that test the test suite.

I tried but gave up adding a new test for this to t ;-)

>>  test_expect_failure () {
>> +if test "$test_in_progress" = 1
>> +then
>> +error "bug in the test script: did you mean test_must_fail 
>> instead of test_expect_failure?"
>> +fi
>
> This follows existing practice for things like the &&-lint-checker, and
> bails out on the whole test script.

Yes, you guessed correctly where the above came from.

> That sometimes makes it hard to find
> the problematic test, especially if you're running via something like
> "prove", because it doesn't make valid TAP output.

Yeah, true.

> It might be nicer if we just said "this test is malformed, and therefore
> fails", and then you get all the usual niceties for recording and
> finding the failed test.
>
> I don't think it would be robust enough to try to propagate the error up
> to the outer test_expect_success block (and anyway, you'd also want to
> know about it in a test_expect_failure block; it's a bug in the test,
> not a known breakage). But perhaps error() could dump some TAP-like
> output with a "virtual" failed test.
>
> Something like:
> ...
> which lets "make prove" collect the broken test number.
>
> It would perhaps need to cover the case when $test_count is "0"
> separately. I dunno. It would be nicer still if we could continue
> running other tests in the script, but I think it's impossible to
> robustly jump back to the outer script.
>
> These kinds of "bug in the test suite" are presumably rare enough that
> the niceties don't matter that much, but I trigger the &&-checker
> reasonably frequently (that and test_line_count, because I can never
> remember the correct invocation).
>
> Anyway. That's all orthogonal to your patch. I just wondered if we could
> do better, but AFAICT the right way to do better is to hook into
> error(), which means your patch would not have to care exactly how it
> fails.

Yeah, the change to error() may be a good thing to do, but it has
quite a many callers in t/*lib*.sh and definitely deserves to be a
separate patch, not tied to this single test.


Re: [RFC] test-lib: detect common misuse of test_expect_failure

2016-10-14 Thread Jeff King
On Fri, Oct 14, 2016 at 03:38:41PM -0700, Junio C Hamano wrote:

> It is a very easy mistake to make to say test_expect_failure when
> making sure a step in the test fails, which must be spelled
> "test_must_fail".  By introducing a toggle $test_in_progress that is
> turned on at the beginning of test_start_() and off at the end of
> test_finish_() helper, we can detect this fairly easily.
> 
> Strictly speaking, writing "test_expect_success" inside another
> test_expect_success (or inside test_expect_failure for that matter)
> can be detected with the same mechanism if we really wanted to, but
> that is a lot less likely confusion, so let's not bother.

I like the general idea, but I'm not sure how this would interact with
the tests in t that test the test suite. It looks like that always
happens in a full sub-shell invocation (via run_sub_test_lib_test), so
we're OK as long as test_in_progress is not exported (and obviously the
subshell cannot accidentally overwrite our variable with a 0 when it is
finished).

> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index fdaeb3a96b..fc8c10a061 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -381,6 +381,10 @@ test_verify_prereq () {
>  }
>  
>  test_expect_failure () {
> + if test "$test_in_progress" = 1
> + then
> + error "bug in the test script: did you mean test_must_fail 
> instead of test_expect_failure?"
> + fi

This follows existing practice for things like the &&-lint-checker, and
bails out on the whole test script. That sometimes makes it hard to find
the problematic test, especially if you're running via something like
"prove", because it doesn't make valid TAP output.

It might be nicer if we just said "this test is malformed, and therefore
fails", and then you get all the usual niceties for recording and
finding the failed test.

I don't think it would be robust enough to try to propagate the error up
to the outer test_expect_success block (and anyway, you'd also want to
know about it in a test_expect_failure block; it's a bug in the test,
not a known breakage). But perhaps error() could dump some TAP-like
output with a "virtual" failed test.

Something like:

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 11562bd..dc6b1f5 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -299,9 +299,8 @@ TERM=dumb
 export TERM
 
 error () {
-   say_color error "error: $*"
-   GIT_EXIT_OK=t
-   exit 1
+   test_failure_ "$@"
+   test_done
 }
 
 say () {
@@ -600,7 +599,7 @@ test_run_ () {
# code of other programs
test_eval_ "(exit 117) && $1"
if test "$?" != 117; then
-   error "bug in the test script: broken &&-chain: $1"
+   error "bug in the test script: broken &&-chain" "$1"
fi
trace=$trace_tmp
fi

which lets "make prove" collect the broken test number.

It would perhaps need to cover the case when $test_count is "0"
separately. I dunno. It would be nicer still if we could continue
running other tests in the script, but I think it's impossible to
robustly jump back to the outer script.

These kinds of "bug in the test suite" are presumably rare enough that
the niceties don't matter that much, but I trigger the &&-checker
reasonably frequently (that and test_line_count, because I can never
remember the correct invocation).

Anyway. That's all orthogonal to your patch. I just wondered if we could
do better, but AFAICT the right way to do better is to hook into
error(), which means your patch would not have to care exactly how it
fails.

-Peff


[RFC] test-lib: detect common misuse of test_expect_failure

2016-10-14 Thread Junio C Hamano
It is a very easy mistake to make to say test_expect_failure when
making sure a step in the test fails, which must be spelled
"test_must_fail".  By introducing a toggle $test_in_progress that is
turned on at the beginning of test_start_() and off at the end of
test_finish_() helper, we can detect this fairly easily.

Strictly speaking, writing "test_expect_success" inside another
test_expect_success (or inside test_expect_failure for that matter)
can be detected with the same mechanism if we really wanted to, but
that is a lot less likely confusion, so let's not bother.

Signed-off-by: Junio C Hamano 
---

 * It is somewhat embarrassing to admit that I had to stare at the
   offending code for more than 5 minutes to notice what went wrong
   to come up with ;
   if we had something like this, it would have helped.

 t/test-lib-functions.sh | 4 
 t/test-lib.sh   | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index fdaeb3a96b..fc8c10a061 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -381,6 +381,10 @@ test_verify_prereq () {
 }
 
 test_expect_failure () {
+   if test "$test_in_progress" = 1
+   then
+   error "bug in the test script: did you mean test_must_fail 
instead of test_expect_failure?"
+   fi
test_start_
test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
test "$#" = 2 ||
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 11562bde10..4c360216e5 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -344,6 +344,7 @@ test_count=0
 test_fixed=0
 test_broken=0
 test_success=0
+test_in_progress=0
 
 test_external_has_tap=0
 
@@ -625,6 +626,7 @@ test_run_ () {
 }
 
 test_start_ () {
+   test_in_progress=1
test_count=$(($test_count+1))
maybe_setup_verbose
maybe_setup_valgrind
@@ -634,6 +636,7 @@ test_finish_ () {
echo >&3 ""
maybe_teardown_valgrind
maybe_teardown_verbose
+   test_in_progress=0
 }
 
 test_skip () {