[GitHub] [flink] echauchot commented on pull request #17849: [FLINK-22775][connectors][cassandra][test] Try to fix flaky Timeout issue

GitBox Fri, 26 Nov 2021 05:30:06 -0800


echauchot commented on pull request #17849:
URL: https://github.com/apache/flink/pull/17849#issuecomment-979979676



   > Were you able to replicate the issue locally, or is this more of a 
"throw-stuff-at-a-wall-and-see-what-sticks" kind of situation? (Which I 
wouldn't mind for this particular test...)
   
   No. This timeout is a flakiness issue that happens under load (see [my 
comment](https://issues.apache.org/jira/browse/FLINK-22775?focusedCommentId=17446552&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17446552)
 in the ticket), and I could not reproduce it in 30 local runs. But I 
am fairly confident that not making the cluster wait for replication on 
write will avoid the timeout under load. 
   
   As explained in my comment in the ticket, I plan to monitor the ITest for 
a few weeks to see whether it is still flaky with my fix. If it is, we could 
consider migrating the Cassandra test cluster from the embedded daemon to 
either Testcontainers (which relies on Docker, so it is less sensitive to 
load) or an ASF v2 licensed test component such as Achilles (which I used in 
Apache Beam and contributed to), which has a lot of knobs for configuring the 
cluster.
   


