errose28 commented on pull request #2549: URL: https://github.com/apache/ozone/pull/2549#issuecomment-902128424
Thanks to everyone who has helped investigate this. Here is my current understanding of how this test is supposed to work: 1. Pipeline created 2. Log failure triggered on datanode 3. Datanode is expected to notify SCM of pipeline close without waiting for the heartbeat interval. 4. Test checks SCM event queue to see if the mock pipeline action handler it registered was called within the timeout. So although the mini ozone conf being set with the heartbeat interval in the wrong order is bad, it should have no bearing on this test. I verified this by setting the heartbeat interval to 10 seconds. The test behaves the same. By turning on trace logging for the EventQueue, we can see that even in the failure cases both the original PipelineActionHandler and the mocked PipelineActionHandler are triggered on SCM pretty quickly after the DN log fails. This gives me reasonable confidence that the code is working as expected and the error was in the test. So since the test seems to pass repeatedly with the new timeout, I am ok with the fix. However, I do not fully understand why we had to increase the timeout to 1 second. SCM event queue fires asynchronuously and I verified the heartbeat interval is not blocking the datanode. I guess it is possible that DN execution time + loopback RPC latency + SCM execution time takes up to 1 second sometimes, although this seems kind of high since it previously finished under 100ms ~29/30 times, and under 500ms ~139/140 times. TLDR I am reasonably confident the code is working correctly and this test fix is ok, but why the test fix was needed in the first place is still a bit of a mystery to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
