On Wed, Oct 8, 2025 at 4:40 AM Mark Thomas <[email protected]> wrote:

> On 07/10/2025 20:31, Coty Sutherland wrote:
> > Hi all,
> >
> > Before I started implementation and submitted a PR, I wanted to share a
> > proposal for some new test targets in the build.xml. The idea is to
> > introduce structured test categories to improve developer productivity
> > and CI efficiency through faster, more targeted test execution without
> > the need to know or use the various fileset patterns.
>
> I'm far more concerned about developer productivity than I am about CI
> usage. As a project, we make relatively little use of CI and are under
> no pressure to reduce that usage.
>

Ack.


>
> Generally:
>
> Test duration varies so much depending on the resources available that I
> don't think it makes a good component of the definition of a test set. I
> suggest using a % of a complete test run on the same hardware with the
> same number of test threads. E.g. the smoke test runs in 5% of the time
> required for the full test suite.
>
> I don't think the test definitions should be defining things like
> running in parallel. There are some CI environments that we use that
> don't have 6 cores. The test definitions should define the tests, with
> the degree of parallelism controlled by test.threads set appropriately
> for the environment.
>
> Currently, we use GitHub actions for the sanity check and BuildBot for
> the full test run. I see no reason to change that although we should
> check that we are using appropriate values for test.threads on both
> platforms. No objection to reviewing what we include in the sanity check.
>
> > *New Test Targets*
> >
> > * ant smoke-test - Runs fast smoke tests (~30 seconds) that verify basic
> > functionality across all major Tomcat components including server
> > startup, core engine, etc. Tests essential class loading and API
> > availability.
> > * ant test-quick - Runs unit tests and critical integration tests (~5
> > minutes) for development validation. Excludes broader integration
> > scenarios, performance tests, and complex deployment tests.
>
> I'm not sure I see the need for both of these.
>

Yeah, I'd be fine with just one target/profile.
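
To make the shape concrete, a single quick target might look something like
the sketch below. The include patterns are placeholders, and I'm assuming the
existing runtests macro (or something like it) can accept a nested fileset;
the real signature may well differ:

```xml
<!-- Sketch only: include patterns are placeholders, and the nested
     fileset usage assumes the runtests macro supports it. -->
<target name="test-quick" depends="compile"
        description="Runs a fast subset of the suite for pre-commit checks">
  <runtests>
    <fileset dir="${test.home}">
      <!-- Explicitly chosen fast tests rather than a naming convention -->
      <include name="org/apache/catalina/startup/TestTomcat.java"/>
      <include name="org/apache/el/**/Test*.java"/>
    </fileset>
  </runtests>
</target>
```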


>
> > * ant test-components - Runs full component testing (~20 minutes) with
> > unit tests for specific components in parallel (6 components; each with
> > their own test target). Excludes cross-component integration tests.
> > * ant test-integration - Runs cross-component integration tests (~30
> > minutes) including WebSocket, SSL/TLS, clustering, session management,
> > authentication, valves, filters, startup lifecycle, and JSP-servlet
> > integration.
>
> I think it might be hard to draw a definitive line between component and
> integration. I tend to look at these as different levels of granularity.
>

Agreed. I thought about groupings in several different ways and it was hard
to find the line between the two.


>
> If Dimitris's idea of a single test target with a parameter to specify
> the set(s) of tests to run is possible that could be really good.
> Especially if it handled overlapping sets and only ran a test once.
>

Yeah, I like that idea better. I didn't even think about the -projecthelp
output being bloated :) I'll look into that and go that route, if possible.
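
As a sketch of how the single-target idea might hang together (the fileset
ids and the test.sets property below are entirely made up): each set could be
a named fileset, with the selected sets combined via Ant's <union> resource
collection, which drops duplicate files, so overlapping sets would only run a
given test once. Mapping a comma-separated -Dtest.sets value onto the refids
would still need a little plumbing (e.g. a scriptdef):

```xml
<!-- Sketch only: ids and the test.sets property are hypothetical. -->
<!-- Intended invocation: ant test -Dtest.sets=catalina,websocket  -->

<fileset id="tests.catalina" dir="${test.home}"
         includes="org/apache/catalina/**/Test*.java"/>
<fileset id="tests.websocket" dir="${test.home}"
         includes="org/apache/tomcat/websocket/**/Test*.java"/>

<!-- union removes duplicates, so tests in overlapping sets run once -->
<union id="tests.selected">
  <fileset refid="tests.catalina"/>
  <fileset refid="tests.websocket"/>
</union>
```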


> > * ant test-performance - Runs performance tests for benchmarking and
> > optimization, including timing-sensitive code paths, memory usage, and
> > throughput tests. Isolated from other categories to prevent flaky
> > failures.
>
> Do we have any flaky performance tests at the moment?
>
> The performance tests tend to fall into one of two categories. Those
> that compare more than one way of doing the same thing and confirm that
> the current Tomcat implementation is using the fastest. And those that
> just provide raw numbers for a given operation. I think the former need
> to stay in the complete test suite. The latter are (or should be)
> already excluded.
>

Agreed. The target's main purpose was to provide an obvious way to run the
performance tests for benchmarking.


>
> > * ant test-tribes-system - Runs comprehensive clustering system tests
> > (30+ minutes) for the Tribes clustering component. These are
> > high-resource,
> > long-running integration scenarios that were previously excluded from the
> > main test suite due to regular failures, now available for thorough
> > validation when working on clustering functionality.
> Tribes is just another component.
> The frequency of the regular failures is low enough that (as far as I
> recall) none of the tribes tests are currently excluded from the full
> test run.
>

Ack. I think I've just excluded them out of habit because it's not a
feature I really ever use/look at.


>
> > Note: times mentioned above are guesstimates based on running with a few
> > test threads since I haven't implemented anything yet.
> >
> > *Component-Specific Test Targets*
> >
> > * ant test-component-catalina - Runs all Catalina tests
> > * ant test-component-coyote - Runs all Coyote tests
> > * ant test-component-jasper - Runs all JSP tests
> > * ant test-component-el - Runs all EL tests
> > * ant test-component-tomcat - Runs all Tomcat utilities, WebSocket,
> > logging, and JNDI tests
> > * ant test-component-servlet - Runs all servlet tests
> >
> > *Key Benefits*
> >
> > For Developers:
> > - 30-second smoke-test feedback vs. a 15-30 minute full suite run
> > (depending on available test threads)
>
> A target for a quick test would be useful. I tend to just do a build.
>
> > - Run only relevant component tests for the systems you're working on
>
> I can see the benefits to developers of having test targets that cover
> particular functionality. Generally, I run all the tests in a package
> through the IDE, but there are certainly times when I need to run tests
> in several packages, where having a single test target would be helpful.
>
> > - More obvious test targets with specific purpose
> > - Quick validation with a shorter feedback loop before commits
> >
> > For CI:
> > - Shorter test runs across multiple jobs/platforms to reduce costs (if
> > there were any)
> > - Replace the 10-20 minute "smoketest" with 30-second validation (or 5
> > minute quick tests) for faster builds and a notable decrease in compute
> > time used for every commit
> > - Avoids the need to update ci.yml to exclude new tests that may cause
> > longer runtimes
> >
> > *Implementation*
> >
> > The implementation of this plan would follow existing conventions to
> > utilize the same JVM args, properties, and exclude patterns as the
> > current runtests macro, preserving compatibility with the current test
> > workflows. There wouldn't be any change to existing test targets, only
> > the new ones introduced. The only change to the suite itself would be
> > the addition of the SmokeTest designation in filenames if we wanted to
> > include new tests for that target; everything else is just creating
> > targets from existing filesets/patterns.
> I'm not at all a fan of defining inclusion in the smoke tests by file
> name. I'd much rather see that group of tests defined by being
> explicitly listed in build.xml.
>

OK.
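
For example (the listed test files below are just illustrative), the smoke
set could be an explicit list maintained in build.xml, so adding a test to
the set is a reviewed change rather than a filename convention:

```xml
<!-- Sketch only: the listed test files are placeholders. -->
<filelist id="tests.smoke" dir="${test.home}">
  <file name="org/apache/catalina/startup/TestTomcat.java"/>
  <file name="org/apache/coyote/http11/TestHttp11Processor.java"/>
</filelist>
```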


>
> > Thoughts? If there aren't any objections I'll start working on a PR :D
>
> It seems that developers are likely to have much more powerful machines
> than most of the CI systems we use. Should that factor into our
> thinking? Does it actually change anything?
>

The primary driver of my interest here was to categorize the tests in
ways that would be useful for someone who's very new to the project. I
usually run the full suite with 11-15 test threads, so it's fast for me,
but not everyone has the hardware for that. The secondary purpose was CI
speed and potential savings there. Even though we're not paying for it or
waiting on it, I was considering the value of reduced energy usage for
compute time.


>
> Finally, when running with lots of threads, the time taken by an
> individual test can dominate the overall timing (e.g. a long running
> test that starts near the end can continue long after all the other
> threads have stopped). As we increase the number of test.threads we use,
> both in CI and locally, we might want to look at those long running
> tests and see if we can break them up.
>

Makes sense.

Thanks for the feedback, I'll pull something together once I'm freed back
up for Tomcat stuff again. Cheers!


>
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>
