[ 
https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=632441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-632441
 ]

ASF GitHub Bot logged work on HDFS-16143:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Aug/21 15:53
            Start Date: 02/Aug/21 15:53
    Worklog Time Spent: 10m 
      Work Description: xkrogen commented on a change in pull request #3235:
URL: https://github.com/apache/hadoop/pull/3235#discussion_r681077707



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestEditLogTailer.java
##########
@@ -488,4 +509,19 @@ private static MiniDFSCluster 
createMiniDFSCluster(Configuration conf,
         .build();
     return cluster;
   }
+
+  private static final class TestTimer extends Timer {

Review comment:
       I think we already have a class called `ManualTimer` or `FakeTimer`, can 
you check again?

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
##########
@@ -423,21 +430,39 @@ void triggerActiveLogRoll() {
     try {
       future = rollEditsRpcExecutor.submit(getNameNodeProxy());
       future.get(rollEditsTimeoutMs, TimeUnit.MILLISECONDS);
-      lastRollTimeMs = monotonicNow();
+      this.lastRollTimeMs = timer.monotonicNow();
       lastRollTriggerTxId = lastLoadedTxnId;
-    } catch (ExecutionException e) {
+    } catch (ExecutionException | InterruptedException e) {
       LOG.warn("Unable to trigger a roll of the active NN", e);
     } catch (TimeoutException e) {
-      if (future != null) {
-        future.cancel(true);
-      }
+      future.cancel(true);
       LOG.warn(String.format(
           "Unable to finish rolling edits in %d ms", rollEditsTimeoutMs));
-    } catch (InterruptedException e) {
-      LOG.warn("Unable to trigger a roll of the active NN", e);
     }
   }
 
+  /**
+   * This is only to be used by tests. For source code, the only way to
+   * set timer is by using EditLogTailer constructor.
+   *
+   * @param timer Timer instance provided by tests.
+   */
+  @VisibleForTesting
+  @SuppressWarnings("checkstyle:hiddenfield")

Review comment:
       can you change the name of the parameter (e.g. `newTimer`) instead of 
suppressing the warning?

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestEditLogTailer.java
##########
@@ -394,9 +396,10 @@ public void 
testStandbyTriggersLogRollsWhenTailInProgressEdits()
     // Time in seconds to wait before checking if edit logs are rolled while
     // expecting no edit log roll
     final int noLogRollWaitTime = 2;
+
     // Time in seconds to wait before checking if edit logs are rolled while
-    // expecting edit log roll
-    final int logRollWaitTime = 3;
+    // expecting edit log roll.
+    final int logRollWaitTime = 7;

Review comment:
       seems we don't need to change this anymore now that we're using the fake 
timer?

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
##########
@@ -423,21 +423,22 @@ void triggerActiveLogRoll() {
     try {
       future = rollEditsRpcExecutor.submit(getNameNodeProxy());
       future.get(rollEditsTimeoutMs, TimeUnit.MILLISECONDS);
-      lastRollTimeMs = monotonicNow();
+      resetLastRollTimeMs();
       lastRollTriggerTxId = lastLoadedTxnId;
-    } catch (ExecutionException e) {
+    } catch (ExecutionException | InterruptedException e) {
       LOG.warn("Unable to trigger a roll of the active NN", e);
     } catch (TimeoutException e) {
-      if (future != null) {
-        future.cancel(true);
-      }
+      future.cancel(true);

Review comment:
       But also this is unrelated to this JIRA, can you put up a separate 
JIRA/PR for it? Particular since this JIRA is just aimed at fixing test 
flakiness, it's better to minimize any production code changes.

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestEditLogTailer.java
##########
@@ -429,19 +432,43 @@ public void 
testStandbyTriggersLogRollsWhenTailInProgressEdits()
       waitForStandbyToCatchUpWithInProgressEdits(standby, activeTxId,
           standbyCatchupWaitTime);
 
+      long curTime = standby.getNamesystem().getEditLogTailer().getTimer()
+          .monotonicNow();
+      // Set TestTimer that returns (curTime + 1 sec) ms while calling
+      // timer.monotonicNow(). This returned value is used by
+      // EditLogTailer.tooLongSinceLastLoad() to determine whether
+      // EditLogTailer should roll the log. Since this test is time
+      // and resource sensitive, we are using custom Timer instance
+      // so that tooLongSinceLastLoad() always returns false
+      // until we change Timer instance in EditLogTailer.

Review comment:
       Can we simplify this comment to just:
   ```
   // move time forward by 1 second, which is not enough to trigger a log roll
   ```
   Better yet, we could save the log roll time in line 405 above:
   ```
   long logRollPeriod = standbyCatchupWaitTime + noLogRollWaitTime + 1;
   ```
   and then set the time to `curTime + TimeUnit.SECONDS.toMillis(logRollPeriod 
/ 2)` -- I think this should be sufficiently self-explanatory to not require a 
comment, since you can clearly see we're only moving time by half the log-roll 
period.

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
##########
@@ -172,14 +171,22 @@
    */
   private final long maxTxnsPerLock;
 
-  public EditLogTailer(FSNamesystem namesystem, Configuration conf) {
+  /**
+   * Timer instance to be set only using constructor.
+   * Only tests can reassign this by using setTimerForTests().
+   * For source code, this timer instance should be treated as final.
+   */
+  private Timer timer;
+
+  public EditLogTailer(FSNamesystem namesystem, Configuration conf,
+      Timer timer) {

Review comment:
       IMO it's better to add a new constructor to avoid changes in the calling 
classes:
   
   ```
   public EditLogTailer(FSNamesystem namesystem, Configuration conf) {
     this(namesystem, conf, new Timer());
   }
   
   EditLogTailer(FSNamesystem namesystem, Configuration conf,
       Timer timer) {
     this.tailerThread = ...
     this.conf = conf;
     this.namesystem = namesystem;
     this.timer = timer;
     ...
   }
   ```
   
   But upon looking further, it seems that we never actually pass in a custom 
timer to the constructor; we only change it via `setTimer`, so I think we can 
get rid of the version that has a custom timer?

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestEditLogTailer.java
##########
@@ -429,19 +432,43 @@ public void 
testStandbyTriggersLogRollsWhenTailInProgressEdits()
       waitForStandbyToCatchUpWithInProgressEdits(standby, activeTxId,
           standbyCatchupWaitTime);
 
+      long curTime = standby.getNamesystem().getEditLogTailer().getTimer()
+          .monotonicNow();
+      // Set TestTimer that returns (curTime + 1 sec) ms while calling
+      // timer.monotonicNow(). This returned value is used by
+      // EditLogTailer.tooLongSinceLastLoad() to determine whether
+      // EditLogTailer should roll the log. Since this test is time
+      // and resource sensitive, we are using custom Timer instance
+      // so that tooLongSinceLastLoad() always returns false
+      // until we change Timer instance in EditLogTailer.
+      standby.getNamesystem().getEditLogTailer().setTimerForTest(
+          new TestTimer(curTime + TimeUnit.SECONDS.toMillis(1)));

Review comment:
       Instead of changing the timer instance multiple times, can you store a 
reference to the timer, then adjust the time returned by the timer?
   ```
   TestTimer timer = new TestTimer(...);
   standby.getNamesystem().getEditLogTailer().setTimerForTest(timer);
   timer.setTime(...);
   ...
   timer.setTime(...);
   ```

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
##########
@@ -423,21 +423,22 @@ void triggerActiveLogRoll() {
     try {
       future = rollEditsRpcExecutor.submit(getNameNodeProxy());
       future.get(rollEditsTimeoutMs, TimeUnit.MILLISECONDS);
-      lastRollTimeMs = monotonicNow();
+      resetLastRollTimeMs();
       lastRollTriggerTxId = lastLoadedTxnId;
-    } catch (ExecutionException e) {
+    } catch (ExecutionException | InterruptedException e) {
       LOG.warn("Unable to trigger a roll of the active NN", e);
     } catch (TimeoutException e) {
-      if (future != null) {
-        future.cancel(true);
-      }
+      future.cancel(true);

Review comment:
       I see. Both `ExecutionException` and `InterruptedException` will also 
only be thrown by `Future#get`, so I think the try-catch should probably look 
like:
   ```
   Future<Void> future = rollEditsRpcExecutor.submit(getNameNodeProxy());
   try {
     future.get(....)
     ...
   } catch (...) {
     ...
   }
   ```
   This will make it more clear why `future` will not be null when we get into 
the `catch` block.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 632441)
    Time Spent: 5h  (was: 4h 50m)

> TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-16143
>                 URL: https://issues.apache.org/jira/browse/HDFS-16143
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: test
>            Reporter: Akira Ajisaka
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
>          Time Spent: 5h
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3229/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {quote}
> [ERROR] 
> testStandbyTriggersLogRollsWhenTailInProgressEdits[0](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 6.862 s  <<< FAILURE!
> java.lang.AssertionError
>       at org.junit.Assert.fail(Assert.java:87)
>       at org.junit.Assert.assertTrue(Assert.java:42)
>       at org.junit.Assert.assertTrue(Assert.java:53)
>       at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRollsWhenTailInProgressEdits(TestEditLogTailer.java:444)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to