Hi guys,

I created https://github.com/apache/activemq/pull/622 for this, and you can see that Jenkins is happy now. The full build took about 120 minutes (2 hours) on Jenkins.

Basically, what I did in the PR:
- removed activemq-unit-tests and the itests (Karaf, Spring3) from the default reactor
- introduced a full.test profile that builds all modules, including the unit tests and itests

The full.test profile is not used in the Jenkinsfile, meaning that a PR build executes all tests except those in the activemq-unit-tests module and the itests. I think that’s acceptable for PRs (and it already takes 2 hours ;)).

I would like to introduce a "static" build on ci-builds.apache.org (not via the Jenkinsfile), executed every week and doing a full build (including the full.test profile).

Thoughts?

Regards
JB

> On 15 March 2021, at 08:20, Jean-Baptiste Onofre <[email protected]> wrote:
>
> Hi guys,
>
> I have created the following Jira issues for the tests I found "flaky" (in a full build, not necessarily in a single execution; it can also depend on the machine, which is why I tested with several Docker setups in terms of CPU and memory):
>
> AMQ-8190: DuplexAdvisoryRaceTest is failing (Jonathan said he is going to take a look)
> AMQ-8189: CachedLDAPAuthorizationModuleTest is failing
> AMQ-8188: AMQ5266SingleDestTest is failing
>
> There’s a test failure in the leveldb module, but it’s not a big deal as I have the PR ready to remove leveldb (https://github.com/apache/activemq/pull/593).
>
> I’m also retesting StompNIOSSLTest; it seems way more stable thanks to Chris.
>
> I also created AMQ-8191 (linked to the previous Jira issues) about cleaning up the profiles, introducing the fast.test profile and using it on Jenkins, and excluding the failing tests until they are fixed (and re-including them at that time).
>
> AMQ-8191 is almost ready, I’m testing.
>
> Regards
> JB
>
>> On 14 March 2021, at 06:04, Jean-Baptiste Onofre <[email protected]> wrote:
>>
>> Hi guys,
>>
>> I’ve updated my local branch according to your comments:
>>
>> 1. I’ve cleaned up the profiles and introduced/renamed a fast profile that executes all unit tests in the modules but excludes activemq-unit-tests and karaf-itests.
>> 2. I’m keeping the smoke test profile.
>> 3. I’ve created a tobefixed profile that includes all the flaky tests I’ve identified.
>> 4. I’ve updated the Jenkinsfile to use the fast profile on PRs.
>>
>> I will create the PR soon.
>>
>> Regards
>> JB
>>
>>> On 13 March 2021, at 06:05, Jean-Baptiste Onofre <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> We already have a "fast" profile, and it’s a good idea to use this profile on Jenkins by default and move some tests there.
>>>
>>> For instance, I don’t think it’s required to launch all of activemq-unit-tests by default, but I would keep the tests in each module (they are fast and don’t need the whole broker infra).
>>>
>>> About RetryRule, I did that in Karaf as well; let me see if it helps for ActiveMQ.
>>>
>>> Thanks!
>>> I will improve things this way.
>>>
>>> Regards
>>> JB
>>>
>>>> On 12 March 2021, at 20:31, Clebert Suconic <[email protected]> wrote:
>>>>
>>>> You should instead have a fast profile, with a subset of the testsuite to run on every commit and branch for these cases. I looked on Jenkins, and having many builds taking 3 hours each won't really scale on the lab anyway. Failures will only make things worse there.
>>>>
>>>> The lab is usually not powerful enough for long-running tests.
>>>>
>>>> And a full profile that should run as part of a full run (say, once a day instead of on every commit, or at any interval you choose).
>>>>
>>>> I don't think you should hide tests though, as that is like sweeping dirt under the rug (even if you say you will enable them later; as with anything in life, temporary solutions usually end up being permanent).
>>>>
>>>> As with any system dealing with timing and asynchronous behavior, flaky tests and races are part of the day. One thing I did in ActiveMQ Artemis was to write a Rule where the test is retried. You could also add retries to tests in cases where it is acceptable, but be careful not to just hide bugs away in this case as well.
>>>>
>>>> If you are interested, in Artemis, look for usages of
>>>> https://github.com/apache/activemq-artemis/blob/master/artemis-commons/src/test/java/org/apache/activemq/artemis/utils/RetryRule.java
>>>>
>>>> You need to activate a profile in Artemis for the RetryRule to work.
>>>>
>>>> On Fri, Mar 12, 2021 at 1:56 PM JB Onofré <[email protected]> wrote:
>>>>>
>>>>> Yes, agreed. I’m launching new builds ;)
>>>>>
>>>>>> On 12 March 2021, at 19:51, Christopher Shannon <[email protected]> wrote:
>>>>>>
>>>>>> Just running it by itself on the command line and also in the IDE. The full build takes a while, and if it's breaking with that then it's probably some other test that isn't cleaning up properly in between runs.
>>>>>>
>>>>>>> On Fri, Mar 12, 2021 at 1:47 PM JB Onofré <[email protected]> wrote:
>>>>>>>
>>>>>>> Did you try in a full build or the test individually? I’m running a new build.
>>>>>>>
>>>>>>>> On 12 March 2021, at 19:38, Christopher Shannon <[email protected]> wrote:
>>>>>>>>
>>>>>>>> I've been running the DurableSyncNetworkBridgeTest several times on my box and it always passes.
>>>>>>>>
>>>>>>>>> On Fri, Mar 12, 2021 at 1:25 PM Christopher Shannon <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Ideally it would be better to fix tests than to simply exclude them. These tests were added for a reason, I would presume (I know I had worked on the durable sync stuff in the past), so randomly turning off tests could lead to missing errors.
>>>>>>>>>
>>>>>>>>> On Fri, Mar 12, 2021 at 12:57 PM Jean-Baptiste Onofre <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I’m adding these tests to be fixed/improved:
>>>>>>>>>>
>>>>>>>>>> FailoverDurableSubTransactionTest.testFailoverCommitListener
>>>>>>>>>> DurableSyncNetworkBridgeTest.testRemoveSubscriptionPropagate
>>>>>>>>>> DurableSyncNetworkBridgeTest.testRemoveSubscriptionWithBridgeOffline
>>>>>>>>>>
>>>>>>>>>> Let me create the Jira and a PR to exclude the tests, and verify that Jenkins is happy.
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> JB
>>>>>>>>>>
>>>>>>>>>>> On 12 March 2021, at 16:14, Jonathan Gallimore <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I'm +1 on the actions :).
>>>>>>>>>>>
>>>>>>>>>>> Jon
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 12, 2021 at 3:11 PM Jean-Baptiste Onofre <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Sure, thanks for the help!
>>>>>>>>>>>>
>>>>>>>>>>>> Just waiting for some feedback before starting the "actions" ;)
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> JB
>>>>>>>>>>>>
>>>>>>>>>>>>> On 12 March 2021, at 14:29, Jonathan Gallimore <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I ran into this test failing yesterday:
>>>>>>>>>>>>> activemq-unit-tests/src/test/java/org/apache/activemq/usecases/DuplexAdvisoryRaceTest.java
>>>>>>>>>>>>> I'd be happy to try and contribute a fix. Would you like to assign the JIRA to me?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jon
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Mar 12, 2021 at 12:58 PM Jean-Baptiste Onofre <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi guys,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now that we have a Jenkinsfile in our repo and we use a Jenkins pipeline, we have dramatically improved our build: the build is executed for each pull request or commit on the main branch.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, we have a lot of failing tests, which cause the build to fail almost systematically on ci-builds.apache.org.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We really need a clean, accurate and stable build: it will improve issue detection and simplify review, especially for pull requests.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I ran several builds on my machine (with different Docker containers) and I have already identified some failing/flaky tests:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - activemq-leveldb-store/src/test/java/org/apache/activemq/leveldb/test/ElectingLevelDBStoreTest.java is not a big deal, as I have a PR removing leveldb completely
>>>>>>>>>>>>>> - activemq-stomp/src/test/java/org/apache/activemq/transport/stomp/Stomp11NIOSSLTest.java. Chris made an improvement, but I still have some flakiness here.
>>>>>>>>>>>>>> - activemq-unit-tests/src/test/java/org/apache/activemq/usecases/DuplexAdvisoryRaceTest.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I propose the following action plan:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Create a Jira for each failing/flaky test.
>>>>>>>>>>>>>> 2. Exclude the tests (in the surefire plugin configuration) to have a "green light" on Jenkins.
>>>>>>>>>>>>>> 3. For each Jira, we work on a pull request, to be sure that Jenkins is still "happy".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Anyone willing to help on (3) is welcome!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If there’s no objection, I will start with (1) and (2).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> JB
>>>>
>>>> --
>>>> Clebert Suconic
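
For reference, the retry idea Clebert describes in the thread can be sketched in plain Java. This is only a minimal illustration under stated assumptions: RetrySketch, Outcome and retry are hypothetical names, the sketch does not use JUnit, and it is not the Artemis RetryRule implementation (the linked RetryRule.java is the reference for that). It only shows the core behavior: rerun a flaky body a bounded number of times, and rethrow the last failure if it never passes so the problem stays visible.

```java
import java.util.function.Supplier;

/**
 * Plain-Java sketch of "retry a flaky test body".
 * Hypothetical helper: not the Artemis RetryRule, and not tied to JUnit.
 */
final class RetrySketch {

    private RetrySketch() {
    }

    /** Value returned by the body and the number of attempts it took. */
    static final class Outcome<T> {
        final T value;
        final int attempts;

        Outcome(T value, int attempts) {
            this.value = value;
            this.attempts = attempts;
        }
    }

    /**
     * Runs the body up to maxAttempts times and returns the first success.
     * Only unchecked exceptions and assertion failures are retried; if every
     * attempt fails, the last failure is rethrown unchanged.
     */
    static <T> Outcome<T> retry(int maxAttempts, Supplier<T> body) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Throwable last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return new Outcome<>(body.get(), attempt);
            } catch (RuntimeException | AssertionError e) {
                last = e;
                // Log each failed attempt so retried failures are not silent.
                System.err.println("attempt " + attempt + " of " + maxAttempts + " failed: " + e);
            }
        }
        if (last instanceof AssertionError) {
            throw (AssertionError) last;
        }
        throw (RuntimeException) last;
    }
}
```

A test body would be wrapped as RetrySketch.retry(3, () -> runTheCheck()), where runTheCheck is a placeholder for the real assertion code. Logging every failed attempt is a deliberate choice in this sketch: a test that only passes on the second or third try is still worth a Jira, which is the caveat raised above about not hiding bugs behind retries.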
