sumangala-patki commented on pull request #3341: URL: https://github.com/apache/hadoop/pull/3341#issuecomment-914703372
I agree that any given process executes tests sequentially. Some findings:

The failure can be reproduced consistently by adding a dummy test (in the same class) that calls the existing test, and then running the full test suite. I verified that each of the two tests makes the correct number of calls to the read/write op increment function for its respective paths, with no extra updates. The statistics reset method also works: all read/write operation counts were verified to be 0 right after reset was called, which happens before each section (small file, large file). This rules out left-over state and reset issues. Since the stats updates happen within the driver before the store calls for read, the response from the remote store does not affect the values. This test run should therefore have passed, given no interference between the statistics reset and the value assertion, and only the correct number of operation increments.

For one failing scenario:

Expected value for large file read op count: 102 or 103
Actual value in streamOps test: 99
Actual value in dummy test: 198
Value according to logs for each test: 103

One way this could have happened is that the two tests (possibly along with any other test class involving read) were running in different processes, but around the same time, and so were modifying the same statistics variable. That would also explain the drop in read count even though the test executed the expected number of read ops: the statistics reset was called in one test while the other test was in the middle of executing reads. Hence, we can conclude that any test that performs read/write and runs in a parallel process alongside the stats test may affect this test.

To avoid this scenario, we can introduce an additional filesystem level statistics variable that is not static, apart from the current static one that records operations globally from all filesystems created in a session.
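
A minimal sketch of the idea, not the actual driver code: `SketchFileSystem`, `GLOBAL_READ_OPS`, `instanceReadOps` and `simulate()` are hypothetical names invented for illustration, and the interleaving of the two "tests" is written out sequentially in a single thread so that the outcome is deterministic. It only illustrates how a reset issued through one filesystem can lower a shared static counter that another test is about to assert on, while a per-instance counter is unaffected.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: hypothetical stand-ins, not the real filesystem classes. */
public class StatsIsolationSketch {

  static final class SketchFileSystem {
    // Shared by every instance, like the current static statistics variable.
    static final AtomicLong GLOBAL_READ_OPS = new AtomicLong();
    // Owned by this instance only, like the proposed non-static variable.
    private final AtomicLong instanceReadOps = new AtomicLong();

    void read() {
      GLOBAL_READ_OPS.incrementAndGet();
      instanceReadOps.incrementAndGet();
    }

    void resetStatistics() {
      GLOBAL_READ_OPS.set(0);
      instanceReadOps.set(0);
    }

    long getInstanceReadOps() {
      return instanceReadOps.get();
    }
  }

  /**
   * Replays one possible interleaving of two tests.
   * Returns {global seen by test A, instance count of A,
   *          global seen by test B, instance count of B}.
   */
  public static long[] simulate() {
    SketchFileSystem fsA = new SketchFileSystem();
    SketchFileSystem fsB = new SketchFileSystem();

    fsA.resetStatistics();                    // test A starts its section
    for (int i = 0; i < 3; i++) fsA.read();   // test A: 3 of its 5 reads
    fsB.resetStatistics();                    // test B resets mid-way through A
    for (int i = 0; i < 2; i++) fsA.read();   // test A: remaining 2 reads
    long globalSeenByA = SketchFileSystem.GLOBAL_READ_OPS.get();

    for (int i = 0; i < 4; i++) fsB.read();   // test B: its 4 reads
    long globalSeenByB = SketchFileSystem.GLOBAL_READ_OPS.get();

    return new long[] {
      globalSeenByA, fsA.getInstanceReadOps(),
      globalSeenByB, fsB.getInstanceReadOps()
    };
  }

  public static void main(String[] args) {
    long[] r = simulate();
    System.out.println("test A: expected 5, global=" + r[0] + ", instance=" + r[1]);
    System.out.println("test B: expected 4, global=" + r[2] + ", instance=" + r[3]);
  }
}
```

In this replay test A performs 5 reads but sees a shared count of 2, and test B performs 4 reads but sees 6, whereas the per-instance counts are 5 and 4. Asserting on the instance-level value would keep the stats test independent of whatever else touches the shared counter.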
