devinbost edited a comment on issue #6198: Flaky-test: PulsarStateTest.testSinkState - wrong number of messages received
URL: https://github.com/apache/pulsar/issues/6198#issuecomment-581671178
 
 
   When running the tests under stress, this issue no longer seems to occur 
after increasing the timeouts (e.g. doubling the retryCount and 
initSleepTimeInMillis passed to retryStrategically). I pushed this change to my 
fork to see whether it also resolves the issue when running in GitHub CI. 
   
   The problem with the current approach to these tests is a race condition 
between checking the status through the Admin API (which executes a REST call) 
and the method responsible for producing the messages. 
   It looks like `producer.send` blocks until the send operation completes but 
not until the message has been received. (Is this right?)
   Ideally, we'd have a way to block (at least for a period of time) until the 
messages are all received instead of needing to poll on the status. 
   However, such a change would not necessarily fix the test, because the test 
would still depend on execution completing successfully within a fixed period 
of time. (We may not have a choice, because without a timeout the test could 
run indefinitely.)
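
   To illustrate the "block until everything is received, but only for a 
bounded time" idea, here is a minimal sketch built on a `CountDownLatch`. It is 
not how the test works today: `CountingReceiver`, `onMessage`, and `awaitAll` 
are hypothetical names, and it assumes the test has some hook on the receiving 
side that it can count from, which may not be the case for state read through 
the Admin API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchSketch {

    /** Hypothetical receiver hook: the consuming side calls onMessage once per message. */
    static final class CountingReceiver {
        private final CountDownLatch remaining;

        CountingReceiver(int expectedMessages) {
            this.remaining = new CountDownLatch(expectedMessages);
        }

        void onMessage(String payload) {
            // Each received message brings the latch one step closer to zero.
            remaining.countDown();
        }

        /** Blocks until every expected message has arrived or the timeout elapses. */
        boolean awaitAll(long timeout, TimeUnit unit) throws InterruptedException {
            return remaining.await(timeout, unit);
        }
    }

    public static void main(String[] args) throws Exception {
        CountingReceiver receiver = new CountingReceiver(10);

        // Stand-in for the sink: delivers 10 messages from another thread.
        Thread sink = new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                receiver.onMessage("foo-" + i);
            }
        });
        sink.start();

        // The test thread waits here instead of polling, but never longer than 5 seconds.
        boolean allReceived = receiver.awaitAll(5, TimeUnit.SECONDS);
        System.out.println("all messages received: " + allReceived);
        sink.join();
    }
}
```

   The timeout is still there, so a slow runner can still fail the test; the 
only gain is that a fast runner returns as soon as the last message arrives.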
   So, that brings us back to the idea of increasing the timeouts used when 
polling the status, so that all the messages are received even on a slow test 
runner. 
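
   For reference, the polling pattern being tuned can be sketched as below. 
This is a standalone approximation, not the actual `retryStrategically` code: 
`pollUntil` is a hypothetical helper, the linear backoff is an assumption, and 
an `AtomicInteger` stands in for the status that the real test reads through 
the Admin API. The parameter names mirror retryCount and initSleepTimeInMillis 
from the comment above.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class PollingSketch {

    /**
     * Checks the condition up to retryCount times, sleeping a little longer after
     * each failed check. Returns true as soon as the condition holds, false if
     * every attempt fails.
     */
    public static boolean pollUntil(BooleanSupplier condition, int retryCount,
                                    long initSleepTimeInMillis) throws InterruptedException {
        for (int i = 0; i < retryCount; i++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (i < retryCount - 1) {
                // Assumed linear backoff: the wait grows by initSleepTimeInMillis per failed attempt.
                Thread.sleep(initSleepTimeInMillis * (i + 1));
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the sink: a background thread that "receives" 10 messages slowly.
        AtomicInteger received = new AtomicInteger();
        Thread sink = new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                try {
                    Thread.sleep(5);
                } catch (InterruptedException e) {
                    return;
                }
                received.incrementAndGet();
            }
        });
        sink.start();

        // Doubling either argument gives a slow runner more time before the check gives up.
        boolean ok = pollUntil(() -> received.get() == 10, 10, 20);
        System.out.println("all messages received: " + ok);
        sink.join();
    }
}
```

   With a backoff like this, doubling retryCount extends the total wait far 
more than doubling initSleepTimeInMillis does, since each extra attempt sleeps 
longer than the one before it.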
   
   @sijie @jiazhai @yjshen Thoughts? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
