Re: svntest.verify.compare_dump_files() under Python3

Daniel Shahaf Sun, 24 Nov 2024 08:11:27 -0800

Daniel Sahlberg wrote on Sat, 23 Nov 2024 16:14 +00:00:
> Den fre 22 nov. 2024 kl 14:33 skrev Daniel Shahaf <d...@daniel.shahaf.name>:
> > If I force an svnsync test to fail:
> >
> > [[[
⋮
> > TypeError: 'in <string>' requires string as left operand, not int
> > FAIL:  svnsync_tests.py 28: copy-revprops with removals
> > PASS:  svnsync_tests.py 29: fd leak during sync from serf to local
> > ]]]
> >
> > So, two things are wrong here:
> >
> > 1. The test harness doesn't actually print the diff.
> >
> >    That's probably because ndiff() is passed two lists of bytes objects,
> >    whereas it expects lists of str objects.
>
> I've tried to track down the root cause.


Thanks.

>                                          I think the issue was "introduced"
> in r1742950 <https://svn.apache.org/r1742950>: "Dumps are binary data,
> therefore explicitly use byte strings for all operations on them". But
> since most tests succeed - good code quality I assume - compare_dump_files
> never reach the ndiff() call in the above traceback and therefor it wasn't
> noticed that ndiff() want strings.
>
> I believe the easiest solution is to add list comprehensions decoding each
> line:
> -    print(''.join(ndiff(expected, actual)))
> +    print(''.join(ndiff([line.decode('utf-8') for line in expected], 
> [line.decode('utf-8') for line in actual])))
>
> I've forced the error in another way, by
> editing 
> subversion/tests/cmdline/svnsync_tests_data/delete-revprops.expected.dump
> changing cmpilato to cMpilato and running
>
> $ make check TESTS=subversion/tests/cmdline/svnsync_tests.py#28
>
> Before the change I got the same TypeError as above. After the change I get
> a failed test with the expected exception.
>
> After reviewing history, I believe the above change is the "obvious fix".
> One could argue if decode() should use some other encoding, I don't really
> know this. I've committed as r1922041.

I'm afraid this fix is incomplete.

The key point, as you quote from the log message of r1742950, is that
dumpfiles are binary data (because of how they represent files and
properties).  That means they are not encoded in /any/ encoding.

Formally, a dumpfile is a finite sequence of octets, which is not
guaranteed to represent a sequence of characters, a sequence of Unicode
codepoints, or anything of the sort.  It's the kind of file that one can
reasonably gzip, base64, or download, but not spell check, wrap to
80 columns, or convert from one encoding or character set to another.

In particular, the call «open('foo.dump', 'rb').read().decode(bar)» is
incorrect for all values of BAR.

Therefore, notwithstanding that calling .decode() improves the behaviour
on the case at hand, it's not The Fix.

Incidentally, decoding the binary dumpfile contents as UTF-8 in
particular is problematic for two reasons:

- Some dumpfiles are in fact not valid UTF-8.  On such dumpfiles,
  compare_dump_files() will raise UnicodeDecodeError (from the .decode()
  call, before ndiff() is ever entered).
  .
  Usually, dumping a repository where any file contains invalid UTF-8
  will result in a non-UTF-8 dump.

- It's not obvious whether .decode("utf-8") is a one-to-one transformation,
  which is to say, whether two different bytes values [sic] could possibly
  decode to the same str value, in which case compare_dump_files() might
  false negative.

So, what /should/ the test be doing?

- Compare the two lists of bytes objects as, well, two lists of bytes
  objects?  That would be nice, if Python's standard library enables it.

- .decode("iso-8859-1")?  This would work, sort of, insofar as it would
  be one-to-one and wouldn't raise for any input value.  However, when
  difflib prints a line, the octets printed would not equal the octets
  in the file: for instance, the byte 0xD7 in the file would likely be
  printed as the bytes 0xC3 0x97 [== 
b"\xD7".decode("iso-8859-1").encode("utf-8")].
  While this transformation /is/ reversible, and won't slow down debugging
  too much for anyone who knew to expect it and how to reverse it, for
  the test harness to do this would be rather unintuitive and surprising.

- I think the easiest solution is to change x.decode("utf-8") to
  repr(x).  repr() is guaranteed to return a string
  (https://docs.python.org/3/library/functions.html#repr), as required
  by difflib, and shouldn't introduce either false positives or false
  negatives.

- Also, the line should be changed to not exceed 80 columns.

>
> >
> > 2. The next test was run.
> >
> >    It shouldn't have run, because the exception wasn't just the actual
> >    output differing from the expected output, but an error in
> >    determining whether they differ or not.
> >
>
> I'm assuming you are looking for something like this:
> [[[
> Index: subversion/tests/cmdline/svntest/main.py
> ===================================================================
> --- subversion/tests/cmdline/svntest/main.py    (revision 1921978)
> +++ subversion/tests/cmdline/svntest/main.py    (working copy)
> @@ -2016,9 +2016,11 @@
>        logger.error('EXCEPTION: SystemExit(%d), skipping cleanup' % ex.code)
>        self._print_name(ex.code and 'FAIL: ' or 'PASS: ')
>        raise
> -    except:
> +    except BaseException as ex:
>        result = svntest.testcase.RESULT_FAIL
> -      logger.warn('CWD: %s' % os.getcwd(), exc_info=True)
> +      logger.warn('CWD: %s' % os.getcwd(), exc_info=False)
> +      logger.warn('Unexpected exception type %s' % type(ex))
> +      raise
>
>      os.chdir(saved_dir)
>      exit_code, result_text, result_benignity = self.pred.results(result)
> ]]]
> (Only the "raise" is actually needed here, I added some additional output
> to help making it clear that the exception type is unexpected).

First, what the incumbent «except» block does — catch "all other
exception types" and not re-raise them — is not best practice.  So,
that block should definitely be changed /somehow/.

More below.

> After this change
[...]
> [[[
⋮
> svnsync_tests.py...............................................................................make:
> *** [Makefile:553: check] Error 1
> ]]]
>
> Without this change
[...]
> [[[
⋮
> svnsync_tests.py...........................................................................................FAILURE
> At least one test FAILED, checking /home/daniel/svn_trunk/tests.log
> FAIL:  svnsync_tests.py 28: copy-revprops with removals
⋮
> make: *** [Makefile:553: check] Error 1
> ]]]
>
> I'm not sure I agree the changed behaviour is better. I like the summary
> and it is clearly indicated that one test failed.

I like the summary too, and I would like for it to stay whenever
possible… but let's untangle this one step at a time.

Running svnsync_tests 28 in pre-r1922041 trunk with s/cmpilato/cMpilato/
(and only that one-bit change; I didn't change anything else in the
dumpfile) reproduces the behaviour you describe here, both cases.

The thing is, in these circumstances delete_revprops() raises TypeError.
This exception type would be raised by the interpreter for certain bugs
in the test code (such as trying to index an int, as in «python3 -c
'42[42]'»).  Therefore, upon seeing TypeError, for all the harness knows
there's a bug /in the Python code doing the testing/, so the harness
(with the patch applied) bails out without printing PASS/FAIL for
test #28 and without running any subsequent tests, just as it would if
test #28 had done «assert False».  The test harness (with the patch
applied) is doing exactly what it should be doing.

In contrast, try running the one-bit-changed version of test #28 under
post-r1922041 trunk.  r1922041 causes delete_revprops() to raise
svntest.Failure where it previously raised TypeError¹.  The
svntest.Failure is caught by one of the «except» blocks above the diff
context, and the human-readable summary gets printed.  This happens both
with and without the change to the last «except» block; there is no
behaviour change.  Moreover, the observed behaviour seems correct to me.

Overall, the reason one case didn't print the human-readable summary was
because in that case the test raised a "possibly an internal error in
the test" exception rather than a "test failed" exception.  That's a bug
in delete_revprops()'s callee svntest.verify.compare_dump_files(), not
in the «except» block — and your r1922041 fixed that bug², whereupon the
human-readable summary gets printed both with and without the patch.

So, I think your changes to the «except» block are essentially correct.
Perhaps the caught class should be Exception (rather than BaseException),
perhaps the block should print more or fewer details, but overall, the
new block behaves as expected, doesn't silently catch and discard unknown
exceptions, and is hardly likely to break anything in tests that PASS…

… but might break tests that XFAIL and rely on the incumbent "catch
everything and discard it" behaviour.  Actually, s/might/does/: with the
«except» patch, the «make check» run is aborted upon reaching
revert_tests.py:revert_remove_added().

That test is decorated @XFail() and uses «assert» to verify its
expectations.  Presumably, when that test is run, one of the «assert»s
fails, so the test function throws AssertionError.  Without the «except»
patch, the AssertionError would be discarded; with the «except» patch,
the AssertionError would be treated as a genuine assertion failure,
i.e., as an indication that the internal state of the testing code has
reached an "unreachable" state.

Nevertheless, I still think the «except» patch is essentially correct;
it simply unearthed an incorrect use of «assert» in revert_tests.py.

Bottom line: +1 (concept).

>

¹ … because delete_revprops()'s dumpfile happens to be valid UTF-8,
  so the .decode() succeeds rather than raise UnicodeDecodeError.

² … for dumpfiles that happen to be valid UTF-8.

>                                                   Checking tests.log should
> make it clear what failed. If I'd change something, I'd probably just add
> the "Unexpected exception" message from the diff above. This would make it
> clear in tests.log that we got an unexpected exception:

You mean, to just add the message, without re-raising the exception or
limiting the caught classes?

I see several problems with that:

- As mentioned in the initial code review, catching everything and
  silently discarding it is not best practice.  And as mentioned in the
  subsequent analysis, the incumbent code literally catches and discards
  AssertionError exceptions.

- If we catch an arbitrary exception and /don't/ re-raise it, we don't
  actually have any assurance that the overall process will exit with
  a non-zero return code, do we?  And nobody ever reads tests.log unless
  «make check» exits non-zero (whether due to a FAIL, an XPASS, or an
  uncaught exception in the test harness).

> [[[
> DIFF of raw dumpfiles (including expected differences)
> --- expected
> +++ actual
⋮
>   File "/usr/lib/python3.11/difflib.py", line 1077, in IS_CHARACTER_JUNK
>     return ch in ws
>            ^^^^^^^^
> TypeError: 'in <string>' requires string as left operand, not int
> W: Unexpected exception type <class 'TypeError'>
> FAIL:  svnsync_tests.py 28: copy-revprops with removals
> ]]]
>
> Thoughts?

Summary:

- s/x.decode("utf-8")/repr(x)/

- not exceed 80 columns

- +1 (concept) on the «except» patch

  - revert_tests should probably be fixed before that patch is committed

And thanks for having taken the time to look into things in the first place, 
Daniel.

Daniel

Re: svntest.verify.compare_dump_files() under Python3

Reply via email to