[ https://issues.apache.org/jira/browse/HDFS-15719?focusedWorklogId=522120&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-522120 ]
ASF GitHub Bot logged work on HDFS-15719:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Dec/20 08:38
            Start Date: 09/Dec/20 08:38
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #2533:
URL: https://github.com/apache/hadoop/pull/2533#issuecomment-741622026

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 26m 20s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 13m 43s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 20m 48s | | trunk passed |
| +1 :green_heart: | compile | 20m 2s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 :green_heart: | compile | 17m 16s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 41s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 8s | | trunk passed |
| +1 :green_heart: | shadedclient | 23m 13s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 36s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 :green_heart: | javadoc | 2m 6s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +0 :ok: | spotbugs | 2m 35s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 33s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 59s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 12s | | the patch passed |
| +1 :green_heart: | compile | 22m 58s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 :green_heart: | javac | 22m 58s | | the patch passed |
| +1 :green_heart: | compile | 20m 6s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 :green_heart: | javac | 20m 6s | | the patch passed |
| +1 :green_heart: | checkstyle | 3m 30s | | the patch passed |
| +1 :green_heart: | mvnsite | 2m 17s | | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 1s | | The patch has no ill-formed XML file. |
| -1 :x: | shadedclient | 20m 17s | | patch has errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 37s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 :green_heart: | javadoc | 2m 10s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +0 :ok: | findbugs | 0m 31s | | hadoop-project has no data from findbugs |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 28s | | hadoop-project in the patch passed. |
| -1 :x: | unit | 10m 43s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2533/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 56s | | The patch does not generate ASF License warnings. |
| | | 224m 2s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.metrics2.source.TestJvmMetrics |
| | hadoop.ha.TestZKFailoverController |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2533/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2533 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 3e550b4a690c 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / aaf9e3d320a |
| Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2533/1/testReport/ |
| Max. process+thread count | 1601 (vs. ulimit of 5500) |
| modules | C: hadoop-project hadoop-common-project/hadoop-common U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2533/1/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 522120)
    Time Spent: 20m  (was: 10m)

> [Hadoop 3] Both NameNodes can crash simultaneously due to the short JN socket timeout
> -------------------------------------------------------------------------------------
>
>                 Key: HDFS-15719
>                 URL: https://issues.apache.org/jira/browse/HDFS-15719
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Wei-Chiu Chuang
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In Hadoop 3, we migrated from Jetty 6 to Jetty 9. This was implemented in
> HADOOP-10075.
> However, HADOOP-10075 erroneously set the HttpServer2 socket idle timeout too
> low.
> We replaced SelectChannelConnector.setLowResourceMaxIdleTime() with
> ServerConnector.setIdleTimeout(), but the two are not equivalent.
> Before the migration, HttpServer2's idle timeout was effectively the Jetty 6
> default of 200 seconds. Since Hadoop 3, the idle timeout is set to 10
> seconds, which is unreasonably short for a JN. If NameNodes try to download
> a large edit log from the JournalNodes (say, a few hundred MB), the download
> is likely to exceed 10 seconds. When that happens, both NNs crash, and there
> is no way to work around it unless you apply the patch in HADOOP-15696, which
> adds a config switch for the idle timeout. Fortunately, this does not happen
> often.
> Proposal: bump the default idle timeout to 200 seconds to match the behavior
> in Jetty 6. (Jetty 9 reduces the default idle timeout to 30 seconds, which is
> not suitable for a JN.)
> Other things to consider:
> 1. fsck servlet? (somehow I suspect this is related to the socket timeout
> reported in HDFS-7175)
> 2. webhdfs, httpfs? --> we've also received reports that webhdfs can time
> out, so having a longer timeout makes sense here.
> 3. kms? will the longer timeout cause more lingering sockets?
> Thanks [~zhenshan.wen] for the discussion.
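
To make the timeout figures in the issue description concrete, here is a minimal Java sketch comparing the three idle timeouts it mentions (10, 30 and 200 seconds). It is not Hadoop or Jetty code: the class name, the `survives` helper and the 25-second stall are hypothetical, and the model is a deliberate simplification that assumes the server closes a connection once it has seen no read or write for at least the idle timeout.

```java
import java.time.Duration;

// Illustrative only: a toy model of a server-side idle timeout.
// None of these names come from Hadoop or Jetty.
class IdleTimeoutSketch {
    // Timeout values quoted in the issue description.
    static final Duration JETTY6_DEFAULT = Duration.ofSeconds(200);
    static final Duration JETTY9_DEFAULT = Duration.ofSeconds(30);
    static final Duration HADOOP3_CURRENT = Duration.ofSeconds(10);

    /**
     * Assumed model: the connection is kept open only while the longest
     * period without any read or write stays strictly below idleTimeout.
     */
    static boolean survives(Duration longestStall, Duration idleTimeout) {
        return longestStall.compareTo(idleTimeout) < 0;
    }

    public static void main(String[] args) {
        // Hypothetical stall while a large edit log is being served.
        Duration stall = Duration.ofSeconds(25);

        System.out.println("10s timeout survives:  " + survives(stall, HADOOP3_CURRENT));
        System.out.println("30s timeout survives:  " + survives(stall, JETTY9_DEFAULT));
        System.out.println("200s timeout survives: " + survives(stall, JETTY6_DEFAULT));
    }
}
```

Under this model, a stall that the 30-second and 200-second settings tolerate is enough to drop the connection at 10 seconds. How long a real edit log transfer stalls depends on the cluster, so the 25-second figure should be read as a placeholder, not a measurement.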