devinbost opened a new issue #6137: Unstable Github tests for Pulsar
URL: https://github.com/apache/pulsar/issues/6137
 
 
   **Description**
   Pulsar tests regularly succeed locally, but after a Pull Request is 
submitted to merge changes into master, some of the tests (usually fewer than 
3 at a time) fail at random in CI. 
   
   Examples include, but are not limited to:
   
   In CI - CPP, Python Tests / cpp-tests:
   
   -     BasicEndToEndTest.testPatternEmptyUnsubscribe
   -     BasicEndToEndTest.testSinglePartitionRoutingPolicy
   
   In CI - Unit - Brokers:
   -     org.apache.pulsar.client.api.SimpleProducerConsumerTest
           -     org.apache.pulsar.client.api.SimpleProducerConsumerTest.setup
   -     org.apache.pulsar.client.impl.BrokerClientIntegrationTest
           -     testUnsupportedBatchMessageConsumer
   
   In CI - Unit - Flaky:
   -     org.apache.pulsar.client.kafka.test.KafkaProducerSimpleConsumerTest
           -     testPulsarKafkaProducerWithSerializer
   -     org.apache.pulsar.functions.worker.PulsarFunctionE2ESecurityTest
           -     testAuthorizationWithAnonymousUser
   
   In CI - Unit - Proxy:
   -     org.apache.pulsar.proxy.server.ProxyParserTest
           -     org.apache.pulsar.proxy.server.ProxyParserTest.testRegexSubscription
   
   
   Regarding `org.apache.pulsar.client.api.SimpleProducerConsumerTest.setup`, I 
found an interesting exception message:
    ```
   
   org.apache.pulsar.client.api.SimpleProducerConsumerTest.setup(org.apache.pulsar.client.api.SimpleProducerConsumerTest)
   [ERROR] Run 1: SimpleProducerConsumerTest.setup:108->MockedPulsarServiceBaseTest.internalSetup:107->MockedPulsarServiceBaseTest.init:144->MockedPulsarServiceBaseTest.startBroker:195->MockedPulsarServiceBaseTest.startBroker:218 » WrongTypeOfReturnValue
   ```
   
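   This message suggests a race condition in the mocking framework (or in our 
use of it): `WrongTypeOfReturnValue` is a Mockito exception, and Mockito's own 
error text lists incorrectly written multi-threaded tests as one possible cause. 
   Perhaps there are known concurrency bugs in some of the versions of the test 
libraries we are using. 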
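
To illustrate the kind of race I have in mind, here is a minimal sketch in plain Java. It is a toy model only and not Mockito's actual implementation: `TinyMock`, `getPort`, and `getName` are hypothetical names, and the sketch assumes that the stubbing target is tracked as shared "last invoked method" state. If a second thread touches the mock between the `when(...)` call and the `thenReturn(...)` call, the stub is applied to the wrong method and the type check fails.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: NOT Mockito's implementation. All names are hypothetical. */
public class StubbingRaceSketch {

    /** Miniature "mock" that remembers only the most recent call. */
    static class TinyMock {
        private final Map<String, Class<?>> returnTypes = new HashMap<>();
        private final Map<String, Object> stubs = new HashMap<>();
        private String lastInvoked; // shared by every thread that uses this mock

        TinyMock() {
            returnTypes.put("getPort", Integer.class);
            returnTypes.put("getName", String.class);
        }

        /** Records the call so that a following thenReturn() knows what to stub. */
        synchronized Object invoke(String method) {
            lastInvoked = method;
            return stubs.get(method);
        }

        /** Stubs whichever method was invoked last, checking the value's type. */
        synchronized void thenReturn(Object value) {
            Class<?> expected = returnTypes.get(lastInvoked);
            if (!expected.isInstance(value)) {
                throw new IllegalStateException("WrongTypeOfReturnValue: " + lastInvoked
                        + " cannot return " + value.getClass().getSimpleName());
            }
            stubs.put(lastInvoked, value);
        }
    }

    /** Replays one unlucky interleaving and returns the outcome as text. */
    static String replayUnluckyInterleaving() throws InterruptedException {
        TinyMock mock = new TinyMock();
        mock.invoke("getPort"); // test thread: the when(mock.getPort()) half
        Thread background = new Thread(() -> mock.invoke("getName"));
        background.start();     // another thread uses the same mock
        background.join();      // force the bad ordering deterministically
        try {
            mock.thenReturn(8080); // test thread: the thenReturn(8080) half
            return "stubbed OK";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // prints "WrongTypeOfReturnValue: getName cannot return Integer"
        System.out.println(replayUnluckyInterleaving());
    }
}
```

In this sketch the bad ordering is forced with `join()`. In a real test run it would depend on thread scheduling, which could explain why a failure like this shows up on loaded CI machines but rarely locally.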
   
   There is also a risk that there are concurrency bugs in Pulsar itself, 
rather than in the tests, that only appear in certain environments. If so, 
these bugs could cause instability for some users in production environments. 
   
   **Expected behavior**
   Tests should not fail at random when run by Jenkins or the GitHub Actions 
CI runner after a Pull Request is submitted. These random failures 
significantly slow down how quickly PRs can be merged and introduce other 
risks. 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
