[
https://issues.apache.org/jira/browse/KUDU-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Henke reassigned KUDU-2619:
---------------------------------
Assignee: Grant Henke
> Track Java test failures in the flaky test dashboard
> ----------------------------------------------------
>
> Key: KUDU-2619
> URL: https://issues.apache.org/jira/browse/KUDU-2619
> Project: Kudu
> Issue Type: Improvement
> Components: java, test
> Affects Versions: n/a
> Reporter: Adar Dembo
> Assignee: Grant Henke
> Priority: Major
>
> Right now our flaky test tracking infrastructure only incorporates C++ tests
> using GTest; we should extend it to include Java tests too.
> I spent some time on this recently and I wanted to collect my notes in one
> place.
> For reference, here's how C++ test reporting works:
> # The build-and-test.sh script rebuilds thirdparty dependencies, builds Kudu,
> and invokes all test suites, optionally using dist-test. After all tests have
> been run, it also collects any dist-test artifacts and logs so that all of
> the test results are available in one place.
> # The run-test.sh script is effectively the C++ "test runner". It is
> responsible for running a test binary, retrying it if it fails, and calling
> report-test.sh after each test run (success or failure). Importantly,
> report-test.sh is invoked once per test binary (not individual test), and on
> test success we don't wait for the script to finish, because we don't care as
> much about collecting successes.
> # The report-test.sh collects some basic information about the test
> environment (such as the git hash used, whether ASAN or TSAN was enabled,
> etc.), then uses curl to send the information to the test result server.
> # The test result server will store the test run result in a database, and
> will query that database to produce a dashboard.
> There are several problems to solve if we're going to replicate this for Java:
> # There's no equivalent to run-test.sh. The entry point for running the Java
> test suite is Gradle, but in dist-test, the test invocation is actually done
> via 'java org.junit.runner.JUnitCore'. Note that C++ test reporting is
> currently also incompatible with dist-test, so the Java tests aren't unique
> in that respect.
> # It'd be some work to replace report-test.sh with a Java equivalent.
> My thinking is that we should move test reporting from run-test.sh into
> build-and-test.sh:
> # It's a good separation of concerns. The test "runners" are responsible for
> running and maybe retrying tests, while the test "aggregator"
> (build-and-test.sh) is responsible for reporting.
> # It's more performant. You can imagine building a test_result_server.py
> endpoint for reporting en masse, which would cut down on the number of round
> trips. That's especially important if we start reporting individual test
> results (as opposed to test _suite_ results).
> # It means the reporting logic need only be written once.
> # It was always a bit unexpected to find reporting logic buried in
> run-test.sh. I mean, it made sense for rapid prototyping but it never really
> made that much sense to me.
> So then the problem is ensuring that, after all tests have run, we have the
> right JUnit XML and log files for every test that ran, including retries,
> which is more tractable, and doable for dist-test environments too.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)