[
https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=631631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631631
]
ASF GitHub Bot logged work on HDFS-16143:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 30/Jul/21 13:35
Start Date: 30/Jul/21 13:35
Worklog Time Spent: 10m
Work Description: virajjasani commented on a change in pull request #3235:
URL: https://github.com/apache/hadoop/pull/3235#discussion_r679753606
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestEditLogTailer.java
##########
@@ -433,15 +440,28 @@ public void
testStandbyTriggersLogRollsWhenTailInProgressEdits()
NameNodeAdapter.mkdirs(active, getDirPath(i),
new PermissionStatus("test", "test",
new FsPermission((short)00755)), true);
+ // reset lastRollTimeMs in EditLogTailer.
+ active.getNamesystem().getEditLogTailer().resetLastRollTimeMs();
Review comment:
Thanks for taking a look @jojochuang.
`EditLogTailer` runs a background thread that decides when to trigger a log
roll by calling the Active Namenode's rollEditLog() API.
```
private void doWork() {
  long currentSleepTimeMs = sleepTimeMs;
  while (shouldRun) {
    long editsTailed = 0;
    try {
      // There's no point in triggering a log roll if the Standby hasn't
      // read any more transactions since the last time a roll was
      // triggered.
      boolean triggeredLogRoll = false;
      if (tooLongSinceLastLoad() &&
          lastRollTriggerTxId < lastLoadedTxnId) {
        triggerActiveLogRoll();
        triggeredLogRoll = true;
      }
      ...
      ...
```
While the test creates new dirs in this for loop, that thread keeps running
its check and can intermittently trigger a log roll through an RPC call to the
Active Namenode. This makes the test flaky, because the test expects the
Standby Namenode's last applied txn id to be less than the Active Namenode's
last written txn id within a time limit. How long the `EditLogTailer` thread
waits before triggering a log roll depends on `lastRollTimeMs`.
In the above code, tooLongSinceLastLoad() refers to:
```
/**
 * @return true if the configured log roll period has elapsed.
 */
private boolean tooLongSinceLastLoad() {
  return logRollPeriodMs >= 0 &&
      (monotonicNow() - lastRollTimeMs) > logRollPeriodMs;
}
```
Hence, no log roll is triggered until `logRollPeriodMs` has elapsed since
`lastRollTimeMs`. However, we have no control over how long the mkdir calls in
this for loop take, and that period can easily elapse in the meantime, which
is why this test is flaky. When we expect the Standby Namenode's txnId to be
less than that of the Active Namenode, that is no longer the case because the
log has already been rolled by the above thread in `EditLogTailer`.
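To make the timing condition concrete, here is a minimal sketch of the same
check with an injectable clock. `RollTimerSketch` and its `clock` field are
hypothetical names used only for illustration; this is not the Hadoop class,
and the body of `resetLastRollTimeMs()` is an assumption about what a reset
does (restart the period from "now").

```java
import java.util.function.LongSupplier;

/** Simplified, hypothetical model of the roll check; not Hadoop code. */
class RollTimerSketch {
  private final long logRollPeriodMs;
  private final LongSupplier clock;  // stand-in for monotonicNow()
  private long lastRollTimeMs;

  RollTimerSketch(long logRollPeriodMs, LongSupplier clock) {
    this.logRollPeriodMs = logRollPeriodMs;
    this.clock = clock;
    this.lastRollTimeMs = clock.getAsLong();
  }

  /** Same condition as the tooLongSinceLastLoad() quoted above. */
  boolean tooLongSinceLastLoad() {
    return logRollPeriodMs >= 0
        && (clock.getAsLong() - lastRollTimeMs) > logRollPeriodMs;
  }

  /** Assumed reset behaviour: restart the roll period from the current time. */
  void resetLastRollTimeMs() {
    lastRollTimeMs = clock.getAsLong();
  }

  public static void main(String[] args) {
    long[] now = {0L};                        // fake clock, in milliseconds
    RollTimerSketch timer = new RollTimerSketch(100L, () -> now[0]);
    now[0] = 150L;                            // mkdir calls took 150 ms in total
    System.out.println(timer.tooLongSinceLastLoad());  // true: 150 > 100
    timer.resetLastRollTimeMs();
    System.out.println(timer.tooLongSinceLastLoad());  // false: 0 > 100 fails
  }
}
```

With a fake clock the outcome is deterministic, which is exactly what the real
test lacks: there, the elapsed time depends on how long the mkdir calls take.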
Hence, it is important for this test to keep resetting `lastRollTimeMs`
while the mkdir calls are executing, so that `tooLongSinceLastLoad()` cannot
return true until we want it to.
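The effect of resetting inside the loop can be sketched with a small
simulation. Everything here is hypothetical (`MkdirLoopSketch`, the fixed
per-call cost, and the assumption that the tailer check runs once right after
each mkdir); it only illustrates the reasoning and does not reproduce the
actual test or the tailer thread's scheduling.

```java
/** Hypothetical simulation of the test's for loop; not Hadoop code. */
class MkdirLoopSketch {

  /**
   * Returns true if the roll condition would fire at some point in the loop.
   * Each iteration advances simulated time by mkdirCostMs, then runs the
   * tailer's check, then optionally resets lastRollTimeMs (mkdir first,
   * reset second, as in the diff above).
   */
  static boolean rollTriggered(int dirs, long mkdirCostMs,
      long logRollPeriodMs, boolean resetEachIteration) {
    long now = 0L;
    long lastRollTimeMs = 0L;
    for (int i = 0; i < dirs; i++) {
      now += mkdirCostMs;  // simulated duration of one mkdir call
      if (logRollPeriodMs >= 0 && (now - lastRollTimeMs) > logRollPeriodMs) {
        return true;       // the tailer would trigger a log roll here
      }
      if (resetEachIteration) {
        lastRollTimeMs = now;  // assumed effect of resetLastRollTimeMs()
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Five mkdirs of 30 ms each against a 100 ms roll period.
    System.out.println(rollTriggered(5, 30L, 100L, false));  // true
    System.out.println(rollTriggered(5, 30L, 100L, true));   // false
  }
}
```

Under these assumptions the reset shrinks the window from the whole loop to a
single mkdir call: a roll can still fire if one call alone takes longer than
the roll period, but the accumulated time of the loop no longer matters.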
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
Issue Time Tracking
-------------------
Worklog Id: (was: 631631)
Time Spent: 3.5h (was: 3h 20m)
> TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
> -----------------------------------------------------------------------------
>
> Key: HDFS-16143
> URL: https://issues.apache.org/jira/browse/HDFS-16143
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: test
> Reporter: Akira Ajisaka
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3229/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {quote}
> [ERROR]
> testStandbyTriggersLogRollsWhenTailInProgressEdits[0](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
> Time elapsed: 6.862 s <<< FAILURE!
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:87)
> at org.junit.Assert.assertTrue(Assert.java:42)
> at org.junit.Assert.assertTrue(Assert.java:53)
> at
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRollsWhenTailInProgressEdits(TestEditLogTailer.java:444)
> {quote}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)