[
https://issues.apache.org/jira/browse/HADOOP-17873?focusedWorklogId=647626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647626
]
ASF GitHub Bot logged work on HADOOP-17873:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 07/Sep/21 23:48
Start Date: 07/Sep/21 23:48
Worklog Time Spent: 10m
Work Description: sumangala-patki commented on pull request #3341:
URL: https://github.com/apache/hadoop/pull/3341#issuecomment-914703372
I agree that any given process executes its tests sequentially.
Some findings:
The failure can be reproduced consistently by adding a dummy test (in the
same class) that calls the existing test, and then running the full test
suite. For each of the two tests, the number of invocations of the read/write
op increment function was verified to be correct for the respective paths,
with no extra updates. The statistics reset method also works: all read/write
operation counts were verified to be 0 right after reset was called, which
happens before each section (small file, large file). This rules out leftover
state and reset issues. Because the stats updates happen within the driver
before the store calls for read, the response from the remote store cannot
affect the values.
Therefore, this test run should have passed: there was no interference
between the statistics reset and the value assertion, and only the correct
number of operation increments took place.
For one failing scenario:
Expected value for large file read op count: 102 or 103
Actual value in streamOps test: 99
Actual value in dummy test: 198
Value according to logs for each test: 103
One way this could have happened is that the two tests (possibly along with
any other test class involving read) were running in different processes at
around the same time and were modifying the same statistics variable. That
would also explain the drop in read count even though the test executed the
expected number of read ops: the statistics reset was called in one test while
the other test was in the middle of executing reads.
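The suspected interleaving can be replayed deterministically. The following is
only a minimal sketch, not the ABFS code: the class SharedReadStats and its
methods are hypothetical names, and the 50 + 53 split of reads is arbitrary.
Only the total of 103 reads is taken from the logs above. It assumes that both
tests can see and reset the same counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a statistics variable visible to every test.
final class SharedReadStats {
    private static final AtomicLong READ_OPS = new AtomicLong();

    static void readOp() { READ_OPS.incrementAndGet(); }
    static void reset() { READ_OPS.set(0); }
    static long readOps() { return READ_OPS.get(); }

    // Replays one possible interleaving and returns the count that
    // test A would observe when it reaches its assertion.
    static long simulateInterleaving(int readsBeforeReset, int readsAfterReset) {
        reset();                                    // test A resets before its section
        for (int i = 0; i < readsBeforeReset; i++) {
            readOp();                               // test A performs some reads
        }
        reset();                                    // test B resets while A is mid-read
        for (int i = 0; i < readsAfterReset; i++) {
            readOp();                               // test A finishes its reads
        }
        return readOps();
    }

    public static void main(String[] args) {
        // Test A performed 103 reads in total but observes fewer.
        System.out.println(simulateInterleaving(50, 53));
    }
}
```

Test A executes all of its reads, yet the reads made before the foreign reset
are lost from the count it asserts on.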
Hence, we can conclude that any test that runs in a parallel process alongside
the stats test may affect it if it performs read/write. To avoid this
scenario, we can introduce an additional filesystem-level statistics variable
that is not static, in addition to the current static one that records
operations globally from all filesystems created in a session.
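A rough sketch of that idea follows. It is an illustration under assumptions,
not the actual ABFS change: FsStatsSketch, recordRead and resetInstanceStats
are hypothetical names. The static counter keeps aggregating globally, while
each filesystem instance also keeps its own counter that a test can reset and
assert on without being affected by other instances.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical filesystem-like class holding both kinds of counters.
final class FsStatsSketch {
    // Global counter: aggregates reads from every instance in the session.
    private static final AtomicLong GLOBAL_READ_OPS = new AtomicLong();
    // Per-instance counter: only reads issued through this instance.
    private final AtomicLong instanceReadOps = new AtomicLong();

    void recordRead() {
        GLOBAL_READ_OPS.incrementAndGet();
        instanceReadOps.incrementAndGet();
    }

    // Resets this instance only; other instances and the global counter are untouched.
    void resetInstanceStats() { instanceReadOps.set(0); }

    long getInstanceReadOps() { return instanceReadOps.get(); }

    static long getGlobalReadOps() { return GLOBAL_READ_OPS.get(); }

    public static void main(String[] args) {
        FsStatsSketch a = new FsStatsSketch();
        FsStatsSketch b = new FsStatsSketch();
        for (int i = 0; i < 50; i++) a.recordRead();
        b.resetInstanceStats();                      // a reset on b does not disturb a
        for (int i = 0; i < 53; i++) a.recordRead();
        for (int i = 0; i < 103; i++) b.recordRead();
        System.out.println(a.getInstanceReadOps());  // reads recorded through a
        System.out.println(b.getInstanceReadOps());  // reads recorded through b
        System.out.println(getGlobalReadOps());      // reads across both instances
    }
}
```

A test would then assert on the per-instance value of the filesystem it
created, leaving the static counter for session-wide reporting.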
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 647626)
Time Spent: 3h 40m (was: 3.5h)
> ABFS: Fix transient failures in ITestAbfsStreamStatistics and
> ITestAbfsRestOperationException
> ---------------------------------------------------------------------------------------------
>
> Key: HADOOP-17873
> URL: https://issues.apache.org/jira/browse/HADOOP-17873
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.1
> Reporter: Sumangala Patki
> Assignee: Sumangala Patki
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> To address transient failures in the following test classes:
> * ITestAbfsStreamStatistics: Uses a filesystem level instance to record
> read/write statistics, which also tracks these operations in other tests
> running in parallel. To be marked for sequential run only to avoid transient
> failure
> * ITestAbfsRestOperationException: The use of a static member to track retry
> count causes transient failures when two tests of this class happen to run
> together. Switch to non-static variable for assertions on retry count
--
This message was sent by Atlassian Jira
(v8.3.4#803005)