[ 
https://issues.apache.org/jira/browse/HDFS-15869?focusedWorklogId=583922&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-583922
 ]

ASF GitHub Bot logged work on HDFS-15869:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Apr/21 00:45
            Start Date: 16/Apr/21 00:45
    Worklog Time Spent: 10m 
      Work Description: hadoop-yetus commented on pull request #2737:
URL: https://github.com/apache/hadoop/pull/2737#issuecomment-820829126


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  0s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  3s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m  5s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m  2s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  16m 20s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 54s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2737/5/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 463 unchanged 
- 0 fixed = 466 total (was 463)  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m  2s |  |  patch has no errors 
when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 458m 45s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2737/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 45s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 545m 26s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus 
|
   |   | hadoop.hdfs.TestViewDistributedFileSystemContract |
   |   | hadoop.hdfs.TestPersistBlocks |
   |   | hadoop.hdfs.TestDFSShell |
   |   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsVolumeList |
   |   | hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeWithHdfsScheme |
   |   | 
hadoop.hdfs.server.blockmanagement.TestAvailableSpaceRackFaultTolerantBPP |
   |   | hadoop.hdfs.TestStateAlignmentContextWithHA |
   |   | 
hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeHdfsFileSystemContract |
   |   | hadoop.hdfs.TestViewDistributedFileSystemWithMountLinks |
   |   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
   |   | hadoop.fs.TestSymlinkHdfsFileSystem |
   |   | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
   |   | 
hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor |
   |   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
   |   | hadoop.hdfs.server.datanode.TestBlockScanner |
   |   | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand |
   |   | hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots |
   |   | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.TestHDFSFileSystemContract |
   |   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
   |   | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2737/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2737 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell xml |
   | uname | Linux 5028b1276e33 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 3c57512d104e3a92391c9a03ce4005a00267c07f |
   | Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2737/5/testReport/ |
   | Max. process+thread count | 3044 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2737/5/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 583922)
    Time Spent: 2h 50m  (was: 2h 40m)

> Network issue while FSEditLogAsync is executing RpcEdit.logSyncNotify can 
> cause the namenode to hang
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15869
>                 URL: https://issues.apache.org/jira/browse/HDFS-15869
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fs async, namenode
>    Affects Versions: 3.2.2
>            Reporter: Haoze Wu
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
>     We were doing some testing of the latest Hadoop stable release 3.2.2 and 
> found some network issue can cause the namenode to hang even with the async 
> edit logging (FSEditLogAsync).
>     The workflow of the FSEditLogAsync thread is basically:
>  # get EditLog from a queue (line 229)
>  # do the transaction (line 232)
>  # sync the log if doSync (line 243)
>  # do logSyncNotify (line 248)
> {code:java}
> //hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogAsync.java
>   @Override
>   public void run() {
>     try {
>       while (true) {
>         boolean doSync;
>         Edit edit = dequeueEdit();                                         // 
> line 229
>         if (edit != null) {
>           // sync if requested by edit log.
>           doSync = edit.logEdit();                                         // 
> line 232
>           syncWaitQ.add(edit);
>         } else {
>           // sync when editq runs dry, but have edits pending a sync.
>           doSync = !syncWaitQ.isEmpty();
>         }
>         if (doSync) {
>           // normally edit log exceptions cause the NN to terminate, but tests
>           // relying on ExitUtil.terminate need to see the exception.
>           RuntimeException syncEx = null;
>           try {
>             logSync(getLastWrittenTxId());                                 // 
> line 243
>           } catch (RuntimeException ex) {
>             syncEx = ex;
>           }
>           while ((edit = syncWaitQ.poll()) != null) {
>             edit.logSyncNotify(syncEx);                                    // 
> line 248
>           }
>         }
>       }
>     } catch (InterruptedException ie) {
>       LOG.info(Thread.currentThread().getName() + " was interrupted, 
> exiting");
>     } catch (Throwable t) {
>       terminate(t);
>     }
>   }
> {code}
>     In terms of the step 4, FSEditLogAsync$RpcEdit.logSyncNotify is 
> essentially doing some network write (line 365).
> {code:java}
> //hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogAsync.java
>   private static class RpcEdit extends Edit {
>     // ...
>     @Override
>     public void logSyncNotify(RuntimeException syncEx) {
>       try {
>         if (syncEx == null) {
>           call.sendResponse();                                       // line 
> 365
>         } else {
>           call.abortResponse(syncEx);
>         }
>       } catch (Exception e) {} // don't care if not sent.
>     }
>     // ...
>   }{code}
>     If the sendResponse operation in line 365 gets stuck, then the whole 
> FSEditLogAsync thread is not able to proceed. In this case, the critical 
> logSync (line 243) can’t be executed, for the incoming transactions. Then the 
> namenode hangs. This is undesirable because FSEditLogAsync’s key feature is 
> asynchronous edit logging that is supposed to tolerate slow I/O.
>     To see why the sendResponse operation in line 365 may get stuck, here is 
> the stack trace:
> {code:java}
>      '(org.apache.hadoop.ipc.Server,channelWrite,3593)',
>      '(org.apache.hadoop.ipc.Server,access$1700,139)',
>      '(org.apache.hadoop.ipc.Server$Responder,processResponse,1657)',
>      '(org.apache.hadoop.ipc.Server$Responder,doRespond,1727)',
>      '(org.apache.hadoop.ipc.Server$Connection,sendResponse,2828)',
>      '(org.apache.hadoop.ipc.Server$Connection,access$300,1799)',
>      '(org.apache.hadoop.ipc.Server$RpcCall,doResponse,1111)',
>      '(org.apache.hadoop.ipc.Server$Call,doResponse,903)',
>      '(org.apache.hadoop.ipc.Server$Call,sendResponse,889)',
>      
> '(org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$RpcEdit,logSyncNotify,365)',
>      '(org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync,run,248)',
>      '(java.lang.Thread,run,748)'
> {code}
>      The `channelWrite` function is defined as follows:
> {code:java}
> //hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
>   private int channelWrite(WritableByteChannel channel,
>                            ByteBuffer buffer) throws IOException {
>     
>     int count =  (buffer.remaining() <= NIO_BUFFER_LIMIT) ?
>                  channel.write(buffer) : channelIO(null, channel, buffer);  
> // line 3594
>     if (count > 0) {
>       rpcMetrics.incrSentBytes(count);
>     }
>     return count;
>   }{code}
>     The `channel.write(buffer)` operation in line 3594 may be slow. Although 
> for this specific stack trace, the channel is initialized in the non-blocking 
> mode, there is still a chance of being slow depending on native write 
> implementation in the OS (e.g., a kernel issue). Furthermore, the channelIO 
> invocation in line 3594 may also get stuck, since it waits until the buffer 
> is drained:
> {code:java}
> //hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
>   private static int channelIO(...) throws IOException {
>     //...
>     while (buf.remaining() > 0) {                                          // 
> line 3637
>       try {
>         int ioSize = Math.min(buf.remaining(), NIO_BUFFER_LIMIT);
>         buf.limit(buf.position() + ioSize);
>         
>         ret = (readCh == null) ? writeCh.write(buf) : readCh.read(buf);    // 
> line 3642
>         
>         if (ret < ioSize) {
>           break;
>         }      } finally {
>         buf.limit(originalLimit);        
>       }
>     }    int nBytes = initialRemaining - buf.remaining(); 
>     return (nBytes > 0) ? nBytes : ret;
>   }
> {code}
>      For example, if the payload is split in two batches, the second batch 
> will have to wait for the first batch to be sent out, which may encounter 
> high packet loss rate and thus slow I/O.
>     In summary, FSEditLogAsync is a critical service, but the potential delay 
> occurring in `edit.logSyncNotify(syncEx)` can block the FSEditLogAsync 
> thread. According to the comment 
> ([https://github.com/apache/hadoop/blob/rel/release-3.2.2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogAsync.java#L369]),
>  “don't care if not sent”, FSEditLogAsync thread in fact does not really need 
> to be concerned with the sync notification.
>     By the way, we found that this issue is related to HDFS-15486 in the old 
> version Hadoop.
> *Reproduction*
>     To show that the potential delay introduced by 
> `FSEditLogAsync$RpcEdit.logSyncNotify` can cause the namenode to hang, we 
> provide the scripts to reproduce the bug.
>     The script basically blocks the FSEditLogAsync thread when it’s invoking 
> `call.sendResponse()` in `FSEditLogAsync$RpcEdit#logSyncNotify`. Our 
> reproduction scripts guarantee that the delay is injected once and only once.
>     The reproduction script is provided in a gist 
> ([https://gist.github.com/functioner/891108ee54e24f155ac395a65e46cbfd]).
> *Fix*
>     Since the `logSyncNotify` is not a critical operation for 
> `FSEditLogAsync`, we propose to put the 
> `FSEditLogAsync$RpcEdit.logSyncNotify` invocation to a separate thread in 
> `FSEditLogAsync`. In this way, even if the notifications get stuck, they will 
> not affect the edit logging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to