xianjingfeng commented on PR #673: URL: https://github.com/apache/incubator-uniffle/pull/673#issuecomment-1478871740
> > I think this problem has nothing to do with this PR. Because `ShuffleFlushManagerTest` is not in `integration-test` module, so i think they are independent of each other.
>
> Yes, I was trying to say, this QuorumTest is not very flaky. So I'm not sure increasing timeout will help. Glad to see you have found the root cause. Cc @zuston PTAL.

There is another question that I can't understand. In the original logic, why can case4 still run normally when server1 is stopped in case3?

I also found other problems in `QuorumTest`:

1. Sometimes `getShuffleResult` fails because of `DEADLINE_EXCEEDED`, not because server1 has stopped.
   https://github.com/apache/incubator-uniffle/blob/7cb0b000cdf7260db3ac77f9d1742b8496c24cb2/integration-test/common/src/test/java/org/apache/uniffle/test/QuorumTest.java#L421-L429
2. `shuffleServers` is a static variable and its elements are modified during a test, which affects other tests.
   https://github.com/apache/incubator-uniffle/blob/d60d675d38c833b99b012a1f4c726a012ce93463/integration-test/common/src/test/java/org/apache/uniffle/test/IntegrationTestBase.java#L60-L61

In general, modifying the timeout period will help expose these problems earlier.
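
To illustrate the second problem, here is a minimal sketch of how a static list shared between tests can leak state from one test into the next. It is not the actual Uniffle code: `FakeServer`, `SharedStateSketch`, and all method names are hypothetical stand-ins, and the reset step is only one possible way to isolate tests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a shuffle server; not the real Uniffle class.
class FakeServer {
    private boolean running = true;

    void stop() {
        running = false;
    }

    boolean isRunning() {
        return running;
    }
}

// Hypothetical stand-in for a test base class that keeps servers in a static field.
class SharedStateSketch {
    // Shared by every "test" in this class, so mutations outlive a single test.
    static final List<FakeServer> servers = new ArrayList<>();

    // Rebuilds the shared list so the next test starts from a known state.
    static void resetServers(int count) {
        servers.clear();
        for (int i = 0; i < count; i++) {
            servers.add(new FakeServer());
        }
    }

    // A "test" that stops the first server and never restores it.
    static void testThatStopsServer() {
        servers.get(0).stop();
    }

    // A precondition a later "test" might rely on: every server is running.
    static boolean allServersRunning() {
        for (FakeServer server : servers) {
            if (!server.isRunning()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        resetServers(3);
        testThatStopsServer();
        // Without a reset, the stopped server is still visible here.
        System.out.println("all running after first test: " + allServersRunning());

        resetServers(3);
        // After a reset, the next test sees a fresh set of servers.
        System.out.println("all running after reset: " + allServersRunning());
    }
}
```

In a real JUnit suite, a reset like this would typically live in a per-test setup or teardown method, so that a test which stops a server cannot change the outcome of whichever test happens to run next.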
