[jira] [Work logged] (HDFS-15683) Allow configuring DISK/ARCHIVE capacity for individual volumes
[ https://issues.apache.org/jira/browse/HDFS-15683?focusedWorklogId=548031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-548031 ]

ASF GitHub Bot logged work on HDFS-15683:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Feb/21 04:25
            Start Date: 05/Feb/21 04:25
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #2625:
URL: https://github.com/apache/hadoop/pull/2625#issuecomment-773778344

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 14m 26s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 5 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 32m 1s | | trunk passed |
| +1 :green_heart: | compile | 1m 24s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | checkstyle | 1m 22s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 22s | | trunk passed |
| +1 :green_heart: | shadedclient | 15m 51s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 54s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 26s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +0 :ok: | spotbugs | 3m 5s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 3s | | trunk passed |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 11s | | the patch passed |
| +1 :green_heart: | compile | 1m 11s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| -1 :x: | javac | 1m 11s | [/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2625/8/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt) | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 14 new + 580 unchanged - 14 fixed = 594 total (was 594) |
| +1 :green_heart: | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | javac | 1m 5s | | the patch passed |
| +1 :green_heart: | checkstyle | 1m 13s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 820 unchanged - 1 fixed = 820 total (was 821) |
| +1 :green_heart: | mvnsite | 1m 10s | | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 2s | | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 12m 41s | | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 48s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 21s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | findbugs | 3m 0s | | the patch passed |
|||| _ Other Tests _ |
| -1 :x: | unit | 192m 10s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2625/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 41s | | The patch does not generate ASF License warnings. |
| | | 291m 0s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.hdfs.TestDFSClientExcludedNodes |
| | hadoop.hdfs.server.namenode.ha.TestHAAppend |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2625/8/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2625 |
| Optional Tests | dupname asflicense compile javac javadoc
[jira] [Commented] (HDFS-15684) EC: Call recoverLease on DFSStripedOutputStream close exception
[ https://issues.apache.org/jira/browse/HDFS-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279306#comment-17279306 ]

Hui Fei commented on HDFS-15684:
--------------------------------

[~wanghongbing] cherry-pick failed, could you please provide patches for branch-3.3, 3.2, and 3.1? Thanks

> EC: Call recoverLease on DFSStripedOutputStream close exception
> ----------------------------------------------------------------
>
>                 Key: HDFS-15684
>                 URL: https://issues.apache.org/jira/browse/HDFS-15684
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: dfsclient, ec
>            Reporter: Hongbing Wang
>            Assignee: Hongbing Wang
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: HDFS-15684.001.patch, HDFS-15684.002.patch, HDFS-15684.003.patch
>
>
> HDFS-14694 added a feature that calls the recoverLease operation automatically when a DFSOutputStream close encounters an exception. When we wanted to apply this feature to our cluster, we found that it does not support EC files.
> This feature should take effect for both replicated files and EC files. This Jira proposes to make it effective for EC files as well.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
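For readers unfamiliar with the pattern under discussion, below is a minimal client-side sketch of what "call recoverLease on close exception" means. It is illustrative only, not the HDFS-14694/HDFS-15684 patch itself; the file path is hypothetical, while DistributedFileSystem#recoverLease is the existing public API.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RecoverLeaseOnCloseSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    Path file = new Path("/ec-dir/example.dat"); // hypothetical EC-policied path
    DistributedFileSystem dfs =
        (DistributedFileSystem) file.getFileSystem(conf);

    FSDataOutputStream out = dfs.create(file);
    try {
      out.write(new byte[]{1, 2, 3});
      out.close();
    } catch (IOException closeFailure) {
      // close() failed, so the client may still hold the file lease.
      // Ask the NameNode to start lease recovery so the file can be
      // reopened or read later; recovery is asynchronous and may need polling.
      boolean fullyClosed = dfs.recoverLease(file);
      if (!fullyClosed) {
        System.err.println("Lease recovery started for " + file);
      }
      throw closeFailure;
    }
  }
}
{code}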
[jira] [Commented] (HDFS-15798) EC: Reconstruct task failed, and XmitsInProgress of DN can become negative
[ https://issues.apache.org/jira/browse/HDFS-15798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279300#comment-17279300 ]

Hui Fei commented on HDFS-15798:
--------------------------------

Committed to trunk, and cherry-picked to branch-3.3, 3.2 and 3.1. [~haiyang Hu] [~sodonnell] Thanks again.

> EC: Reconstruct task failed, and XmitsInProgress of DN can become negative
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-15798
>                 URL: https://issues.apache.org/jira/browse/HDFS-15798
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Haiyang Hu
>            Assignee: Haiyang Hu
>            Priority: Major
>             Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>         Attachments: HDFS-15798.001.patch, HDFS-15798.002.patch, HDFS-15798.003.patch
>
>
> When an EC reconstruction task fails, decrementXmitsInProgress in processErasureCodingTasks can decrement by a value inconsistent with what was incremented. As a result, XmitsInProgress of the DN can become negative, which affects how the NN chooses pending tasks, since that choice is based on the ratio between the lengths of the replication and erasure-coded block queues.
> {code:java}
> // 1. ErasureCodingWorker.java
> public void processErasureCodingTasks(
>     Collection<BlockECReconstructionInfo> ecTasks) {
>   for (BlockECReconstructionInfo reconInfo : ecTasks) {
>     int xmitsSubmitted = 0;
>     try {
>       ...
>       // It may throw IllegalArgumentException from task#stripedReader
>       // constructor.
>       final StripedBlockReconstructor task =
>           new StripedBlockReconstructor(this, stripedReconInfo);
>       if (task.hasValidTargets()) {
>         // See HDFS-12044. We increase xmitsInProgress even the task is only
>         // enqueued, so that
>         // 1) NN will not send more tasks than what DN can execute and
>         // 2) DN will not throw away reconstruction tasks, and instead keeps
>         // an unbounded number of tasks in the executor's task queue.
>         xmitsSubmitted = Math.max((int)(task.getXmits() * xmitWeight), 1);
>         getDatanode().incrementXmitsInProcess(xmitsSubmitted); // task start increment
>         stripedReconstructionPool.submit(task);
>       } else {
>         LOG.warn("No missing internal block. Skip reconstruction for task:{}",
>             reconInfo);
>       }
>     } catch (Throwable e) {
>       getDatanode().decrementXmitsInProgress(xmitsSubmitted); // task failed decrement, XmitsInProgress is decremented by the previous value
>       LOG.warn("Failed to reconstruct striped block {}",
>           reconInfo.getExtendedBlock().getLocalBlock(), e);
>     }
>   }
> }
>
> // 2. StripedBlockReconstructor.java
> public void run() {
>   try {
>     initDecoderIfNecessary();
>     ...
>   } catch (Throwable e) {
>     LOG.warn("Failed to reconstruct striped block: {}", getBlockGroup(), e);
>     getDatanode().getMetrics().incrECFailedReconstructionTasks();
>   } finally {
>     float xmitWeight = getErasureCodingWorker().getXmitWeight();
>     // if the xmits is smaller than 1, the xmitsSubmitted should be set to 1
>     // because if it is set to zero, we cannot measure the xmits submitted
>     int xmitsSubmitted = Math.max((int) (getXmits() * xmitWeight), 1);
>     getDatanode().decrementXmitsInProgress(xmitsSubmitted); // task complete decrement
>     ...
>   }
> }{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
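The accounting invariant behind this fix can be shown in isolation. The sketch below is not the actual patch; it only demonstrates that whatever value is added to the in-progress counter at submission must be the exact value subtracted on failure, otherwise the counter drifts negative as described above.

{code:java}
import java.util.concurrent.atomic.AtomicInteger;

public class XmitsAccountingSketch {
  private final AtomicInteger xmitsInProgress = new AtomicInteger();

  /** Weighted charge for a task, floored at 1 as in the snippet above. */
  private static int chargeFor(float taskXmits, float xmitWeight) {
    return Math.max((int) (taskXmits * xmitWeight), 1);
  }

  void submit(float taskXmits, float xmitWeight, Runnable enqueue) {
    // Compute the charge once and reuse it on every exit path.
    final int charged = chargeFor(taskXmits, xmitWeight);
    xmitsInProgress.addAndGet(charged);
    try {
      enqueue.run(); // may throw before the task ever runs
    } catch (Throwable t) {
      // Release exactly what was acquired. Decrementing a recomputed or
      // stale value here is what lets the counter go negative.
      xmitsInProgress.addAndGet(-charged);
      throw t;
    }
  }

  int current() {
    return xmitsInProgress.get();
  }
}
{code}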
[jira] [Updated] (HDFS-15798) EC: Reconstruct task failed, and XmitsInProgress of DN can become negative
[ https://issues.apache.org/jira/browse/HDFS-15798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui Fei updated HDFS-15798:
---------------------------
    Fix Version/s: 3.2.3
                   3.1.5
                   3.4.0
                   3.3.1
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> EC: Reconstruct task failed, and XmitsInProgress of DN can become negative
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-15798
>                 URL: https://issues.apache.org/jira/browse/HDFS-15798
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Haiyang Hu
>            Assignee: Haiyang Hu
>            Priority: Major
>             Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>         Attachments: HDFS-15798.001.patch, HDFS-15798.002.patch, HDFS-15798.003.patch
>
>
> When an EC reconstruction task fails, decrementXmitsInProgress in processErasureCodingTasks can decrement by a value inconsistent with what was incremented. As a result, XmitsInProgress of the DN can become negative, which affects how the NN chooses pending tasks, since that choice is based on the ratio between the lengths of the replication and erasure-coded block queues.
> {code:java}
> // 1. ErasureCodingWorker.java
> public void processErasureCodingTasks(
>     Collection<BlockECReconstructionInfo> ecTasks) {
>   for (BlockECReconstructionInfo reconInfo : ecTasks) {
>     int xmitsSubmitted = 0;
>     try {
>       ...
>       // It may throw IllegalArgumentException from task#stripedReader
>       // constructor.
>       final StripedBlockReconstructor task =
>           new StripedBlockReconstructor(this, stripedReconInfo);
>       if (task.hasValidTargets()) {
>         // See HDFS-12044. We increase xmitsInProgress even the task is only
>         // enqueued, so that
>         // 1) NN will not send more tasks than what DN can execute and
>         // 2) DN will not throw away reconstruction tasks, and instead keeps
>         // an unbounded number of tasks in the executor's task queue.
>         xmitsSubmitted = Math.max((int)(task.getXmits() * xmitWeight), 1);
>         getDatanode().incrementXmitsInProcess(xmitsSubmitted); // task start increment
>         stripedReconstructionPool.submit(task);
>       } else {
>         LOG.warn("No missing internal block. Skip reconstruction for task:{}",
>             reconInfo);
>       }
>     } catch (Throwable e) {
>       getDatanode().decrementXmitsInProgress(xmitsSubmitted); // task failed decrement, XmitsInProgress is decremented by the previous value
>       LOG.warn("Failed to reconstruct striped block {}",
>           reconInfo.getExtendedBlock().getLocalBlock(), e);
>     }
>   }
> }
>
> // 2. StripedBlockReconstructor.java
> public void run() {
>   try {
>     initDecoderIfNecessary();
>     ...
>   } catch (Throwable e) {
>     LOG.warn("Failed to reconstruct striped block: {}", getBlockGroup(), e);
>     getDatanode().getMetrics().incrECFailedReconstructionTasks();
>   } finally {
>     float xmitWeight = getErasureCodingWorker().getXmitWeight();
>     // if the xmits is smaller than 1, the xmitsSubmitted should be set to 1
>     // because if it is set to zero, we cannot measure the xmits submitted
>     int xmitsSubmitted = Math.max((int) (getXmits() * xmitWeight), 1);
>     getDatanode().decrementXmitsInProgress(xmitsSubmitted); // task complete decrement
>     ...
>   }
> }{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui Fei resolved HDFS-15819.
----------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

> Fix a codestyle issue for TestQuotaByStorageType
> ------------------------------------------------
>
>                 Key: HDFS-15819
>                 URL: https://issues.apache.org/jira/browse/HDFS-15819
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Baolong Mao
>            Assignee: Baolong Mao
>            Priority: Trivial
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui Fei updated HDFS-15819:
---------------------------
    Priority: Trivial  (was: Major)

> Fix a codestyle issue for TestQuotaByStorageType
> ------------------------------------------------
>
>                 Key: HDFS-15819
>                 URL: https://issues.apache.org/jira/browse/HDFS-15819
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Baolong Mao
>            Assignee: Baolong Mao
>            Priority: Trivial
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui Fei updated HDFS-15819:
---------------------------
    Issue Type: Improvement  (was: New Feature)

> Fix a codestyle issue for TestQuotaByStorageType
> ------------------------------------------------
>
>                 Key: HDFS-15819
>                 URL: https://issues.apache.org/jira/browse/HDFS-15819
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Baolong Mao
>            Assignee: Baolong Mao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-15819?focusedWorklogId=548007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-548007 ]

ASF GitHub Bot logged work on HDFS-15819:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Feb/21 02:11
            Start Date: 05/Feb/21 02:11
    Worklog Time Spent: 10m
      Work Description: ferhui merged pull request #2681:
URL: https://github.com/apache/hadoop/pull/2681

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 548007)
    Time Spent: 40m  (was: 0.5h)

> Fix a codestyle issue for TestQuotaByStorageType
> ------------------------------------------------
>
>                 Key: HDFS-15819
>                 URL: https://issues.apache.org/jira/browse/HDFS-15819
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Baolong Mao
>            Assignee: Baolong Mao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-15819?focusedWorklogId=548004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-548004 ]

ASF GitHub Bot logged work on HDFS-15819:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Feb/21 02:08
            Start Date: 05/Feb/21 02:08
    Worklog Time Spent: 10m
      Work Description: ferhui commented on pull request #2681:
URL: https://github.com/apache/hadoop/pull/2681#issuecomment-773731300

@maobaolong Thanks for the fix, merged

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 548004)
    Time Spent: 0.5h  (was: 20m)

> Fix a codestyle issue for TestQuotaByStorageType
> ------------------------------------------------
>
>                 Key: HDFS-15819
>                 URL: https://issues.apache.org/jira/browse/HDFS-15819
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Baolong Mao
>            Assignee: Baolong Mao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547968&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547968 ]

ASF GitHub Bot logged work on HDFS-15820:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Feb/21 00:01
            Start Date: 05/Feb/21 00:01
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #2682:
URL: https://github.com/apache/hadoop/pull/2682#issuecomment-773683449

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 31s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 34m 17s | | trunk passed |
| +1 :green_heart: | compile | 1m 30s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 1m 24s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | checkstyle | 1m 12s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 25s | | trunk passed |
| +1 :green_heart: | shadedclient | 16m 19s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 56s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 36s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +0 :ok: | spotbugs | 3m 4s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 2s | | trunk passed |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 11s | | the patch passed |
| +1 :green_heart: | compile | 1m 11s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 1m 11s | | the patch passed |
| +1 :green_heart: | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | javac | 1m 5s | | the patch passed |
| -0 :warning: | checkstyle | 1m 0s | [/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/2/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 299 unchanged - 0 fixed = 301 total (was 299) |
| +1 :green_heart: | mvnsite | 1m 11s | | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 12m 49s | | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 48s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 26s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | findbugs | 3m 3s | | the patch passed |
|||| _ Other Tests _ |
| -1 :x: | unit | 209m 7s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. |
| | | 297m 23s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
| | hadoop.hdfs.server.namenode.ha.TestObserverNode |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
| | hadoop.hdfs.TestDecommissionWithStriped |
| | hadoop.hdfs.TestSetrepDecreasing |
| | hadoop.hdfs.TestErasureCodingPolicyWithSnapshot |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2682 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 9cd05c4da773 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6
[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming
[ https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279230#comment-17279230 ]

Kihwal Lee commented on HDFS-15813:
-----------------------------------

+1. The unit test failures seem unrelated. If you can't find an existing Jira for the failures, please file one for each.

I've looked at {{TestUnderReplicatedBlocks#testSetRepIncWithUnderReplicatedBlocks}} briefly. It appears to be a test issue. The test artificially invalidated a replica on a node, but before the test made further progress, the NN fixed the under-replication by having another node send the block to the same node. The test then went ahead and removed it from the NN's data structure (blocksmap) and called {{setReplication()}}. The NN picked two nodes, but one of them was the node that already has the block replica; it was only missing from the NN's data structure. Again, this happened because the NN fixed the under-replication between the test deleting the replica and modifying the NN data structure. The replication failed with {{ReplicaAlreadyExistsException}}.

This kind of inconsistency does not happen in real clusters, but even if it did, it would be fixed when the replication times out. The test is set to time out before the default replication timeout, so it didn't have any chance to do that.

> DataStreamer: keep sending heartbeat packets while streaming
> -------------------------------------------------------------
>
>                 Key: HDFS-15813
>                 URL: https://issues.apache.org/jira/browse/HDFS-15813
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>         Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, HDFS-15813.003.patch, HDFS-15813.004.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to ensure that heartbeats continue during data streaming, even in the face of a slow disk.
> As [~kihwal] noted, the absence of heartbeats during flush will be fixed in a separate jira. It doesn't look like this change was ever pushed back to Apache, so I am providing it here.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
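As background for the change under review, the heartbeat-while-streaming idea can be sketched as a sender loop whose wait on the data queue is bounded by the time remaining until the next heartbeat is due. This is a simplified illustration, not the DataStreamer code itself; the interval constant and class names are hypothetical.

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

public class HeartbeatingSenderSketch implements Runnable {
  private final Deque<byte[]> dataQueue = new ArrayDeque<>();
  private final long heartbeatIntervalMs = 30_000L; // hypothetical interval
  private volatile boolean running = true;
  private long lastPacketSentMs = System.currentTimeMillis();

  @Override
  public void run() {
    while (running) {
      byte[] packet;
      synchronized (dataQueue) {
        long now = System.currentTimeMillis();
        // Never sleep past the heartbeat deadline, even if a slow disk
        // keeps the data queue empty for a long time.
        while (dataQueue.isEmpty()
            && now - lastPacketSentMs < heartbeatIntervalMs) {
          try {
            dataQueue.wait(heartbeatIntervalMs - (now - lastPacketSentMs));
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
          }
          now = System.currentTimeMillis();
        }
        packet = dataQueue.poll(); // null means the heartbeat is due
      }
      send(packet != null ? packet : heartbeatPacket());
      lastPacketSentMs = System.currentTimeMillis();
    }
  }

  private byte[] heartbeatPacket() {
    return new byte[0]; // stand-in for a real zero-length heartbeat packet
  }

  private void send(byte[] packet) {
    // write to the downstream pipeline (elided)
  }
}
{code}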
[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming
[ https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279213#comment-17279213 ]

Hadoop QA commented on HDFS-15813:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 50s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 39s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 25s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 48s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 13s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 11s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 37s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 15s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 42s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 1s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 6s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 6s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 42s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 42s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s{color} | {color:green}{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 86 unchanged - 1 fixed = 86 total (was 87) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 2s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 55s{color} | {color:green}{color} | {color:green} patch
[jira] [Created] (HDFS-15821) Add metrics for in-service datanodes
Zehao Chen created HDFS-15821:
---------------------------------

             Summary: Add metrics for in-service datanodes
                 Key: HDFS-15821
                 URL: https://issues.apache.org/jira/browse/HDFS-15821
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Zehao Chen
            Assignee: Zehao Chen

We currently have metrics for live datanodes, but some of those datanodes may be in a decommissioning or maintenance state. Adding this metric lets us know how many nodes are currently in service, where NumInServiceDatanodes = NumLiveDataNodes - NumDecomLiveDataNodes - NumInMaintenanceLiveDataNodes.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
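A sketch of the proposed derived metric, assuming illustrative field names rather than the actual FSNamesystem metric sources:

{code:java}
public class InServiceDatanodesSketch {
  int numLiveDataNodes;              // all live DNs
  int numDecomLiveDataNodes;         // live but decommissioning
  int numInMaintenanceLiveDataNodes; // live but in maintenance

  /** Live nodes actually serving traffic, per the formula above. */
  int getNumInServiceDatanodes() {
    return numLiveDataNodes
        - numDecomLiveDataNodes
        - numInMaintenanceLiveDataNodes;
  }
}
{code}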
[jira] [Commented] (HDFS-15757) RBF: Improving Router Connection Management
[ https://issues.apache.org/jira/browse/HDFS-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279190#comment-17279190 ]

Fengnan Li commented on HDFS-15757:
-----------------------------------

[~hexiaoqiao] Thanks for the question. There are three latencies:
1. RPC queue time: the time spent in the RPC queue, which is not related to this change.
2. RPC processing time: measured before the actual proxy op (getting a TCP connection and talking to the NN), so it is also unrelated.
3. Proxy time: directly impacted, since the change improves getConnection() a lot.

I have done some flame graphs for the Router to understand the performance bottleneck, and I often see getConnection() in the stack taking a lot of time. With this change, connections are kept active as much as possible, versus previously, where connections were left not quite closed and hit the pool's connection cap, so no more active connections could be created.

From the last graph I included in the doc, we can see the ProxyTime is improved. Feel free to give it a try in your setup. It's always good to have a second eye on it.

> RBF: Improving Router Connection Management
> -------------------------------------------
>
>                 Key: HDFS-15757
>                 URL: https://issues.apache.org/jira/browse/HDFS-15757
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: rbf
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: RBF_ Improving Router Connection Management_v2.pdf, RBF_ Improving Router Connection Management_v3.pdf, RBF_ Router Connection Management.pdf
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We have seen a high number of connections from Routers to namenodes, leaving the namenodes unstable.
> This ticket tries to reduce connections through some changes. Please take a look at the design and leave comments.
> Thanks!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
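To make the getConnection() discussion concrete, here is a heavily simplified pool sketch under the assumption (mine, not necessarily the patch's mechanism) that the improvement amounts to evicting idle-but-unclosed connections so the per-pool cap is spent on active ones. The real RBF ConnectionManager is considerably more involved.

{code:java}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ConnectionPoolSketch {
  private static final long MAX_IDLE_MS = 60_000L; // hypothetical threshold
  private static final int CAP = 16;               // hypothetical pool cap

  static final class Conn {
    volatile boolean inUse;
    volatile long lastUsedMs;
    void close() { /* tear down the TCP connection (elided) */ }
  }

  private final List<Conn> pool = new CopyOnWriteArrayList<>();

  synchronized Conn getConnection() {
    for (Conn c : pool) {
      if (!c.inUse) {
        c.inUse = true;
        c.lastUsedMs = System.currentTimeMillis();
        return c;
      }
    }
    if (pool.size() < CAP) {
      Conn c = new Conn();
      c.inUse = true;
      c.lastUsedMs = System.currentTimeMillis();
      pool.add(c);
      return c;
    }
    return null; // at the cap: caller waits until the cleaner frees capacity
  }

  /** Periodic cleaner: close connections idle longer than MAX_IDLE_MS,
   *  instead of leaving them "not quite closed" and counted against the cap. */
  synchronized void evictIdle() {
    long now = System.currentTimeMillis();
    for (Conn c : pool) {
      if (!c.inUse && now - c.lastUsedMs > MAX_IDLE_MS) {
        c.close();
        pool.remove(c); // safe with CopyOnWriteArrayList's snapshot iterator
      }
    }
  }
}
{code}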
[jira] [Work logged] (HDFS-15817) Rename snapshots while marking them deleted
[ https://issues.apache.org/jira/browse/HDFS-15817?focusedWorklogId=547906&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547906 ]

ASF GitHub Bot logged work on HDFS-15817:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Feb/21 21:41
            Start Date: 04/Feb/21 21:41
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #2677:
URL: https://github.com/apache/hadoop/pull/2677#issuecomment-773621670

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 43s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 89m 46s | | trunk passed |
| +1 :green_heart: | compile | 1m 26s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 1m 30s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | checkstyle | 1m 20s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 39s | | trunk passed |
| +1 :green_heart: | shadedclient | 19m 58s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 45s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +0 :ok: | spotbugs | 3m 54s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 50s | | trunk passed |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 30s | | the patch passed |
| +1 :green_heart: | compile | 1m 32s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 1m 32s | | the patch passed |
| +1 :green_heart: | compile | 1m 24s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | javac | 1m 24s | | the patch passed |
| +1 :green_heart: | checkstyle | 1m 5s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 27s | | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 17m 47s | | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 59s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 34s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | findbugs | 4m 9s | | the patch passed |
|||| _ Other Tests _ |
| -1 :x: | unit | 209m 30s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2677/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. |
| | | 366m 46s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.tools.TestViewFileSystemOverloadSchemeWithDFSAdmin |
| | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
| | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
| | hadoop.hdfs.server.namenode.snapshot.TestOrderedSnapshotDeletionGc |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2677/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2677 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 8829bc4f5317 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 5f34271bb14 |
| Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private
[jira] [Work logged] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547900&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547900 ]

ASF GitHub Bot logged work on HDFS-15820:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Feb/21 21:37
            Start Date: 04/Feb/21 21:37
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #2682:
URL: https://github.com/apache/hadoop/pull/2682#issuecomment-773619671

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 1m 4s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 93m 13s | | trunk passed |
| +1 :green_heart: | compile | 1m 21s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 1m 10s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | checkstyle | 1m 4s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 21s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 57s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 53s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 23s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +0 :ok: | spotbugs | 3m 16s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 13s | | trunk passed |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 12s | | the patch passed |
| +1 :green_heart: | compile | 1m 16s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 1m 16s | | the patch passed |
| +1 :green_heart: | compile | 1m 6s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | javac | 1m 6s | | the patch passed |
| -0 :warning: | checkstyle | 0m 57s | [/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/1/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 299 unchanged - 0 fixed = 302 total (was 299) |
| +1 :green_heart: | mvnsite | 1m 11s | | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 15m 7s | | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 49s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 :green_heart: | findbugs | 3m 15s | | the patch passed |
|||| _ Other Tests _ |
| -1 :x: | unit | 221m 3s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 36s | | The patch does not generate ASF License warnings. |
| | | 372m 2s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
| | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2682 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux aa60dff82e51 4.15.0-126-generic #129-Ubuntu SMP Mon Nov 23 18:53:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 5f34271bb14 |
| Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| Multi-JDK versions |
[jira] [Updated] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siyao Meng updated HDFS-15820:
------------------------------
    Status: Patch Available  (was: Open)

> Ensure snapshot root trash provisioning happens only post safe mode exit
> --------------------------------------------------------------------------
>
>                 Key: HDFS-15820
>                 URL: https://issues.apache.org/jira/browse/HDFS-15820
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts along with the trash emptier service, but the namenode might not be out of safe mode by then. This can fail the snapshot trash dir creation, thereby crashing the namenode. The idea here is to trigger snapshot trash provisioning only after safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode.
> The reported blocks 0 needs additional 1383 blocks to reach the threshold 0.9990 of total blocks 1385.
> The number of live datanodes 0 needs an additional 1 live datanodes to reach the minimum number 1.
> Safe mode will be turned off automatically once the thresholds have been reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
>         at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:967)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:936)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740)
> 2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode.
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547784 ]

ASF GitHub Bot logged work on HDFS-15820:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Feb/21 19:02
            Start Date: 04/Feb/21 19:02
    Worklog Time Spent: 10m
      Work Description: smengcl commented on a change in pull request #2682:
URL: https://github.com/apache/hadoop/pull/2682#discussion_r570461526

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##########

@@ -8531,25 +8527,37 @@ void checkAccess(String src, FsAction mode) throws IOException {
    * Check if snapshot roots are created for all existing snapshottable
    * directories. Create them if not.
    */
-  void checkAndProvisionSnapshotTrashRoots() throws IOException {
-    SnapshottableDirectoryStatus[] dirStatusList = getSnapshottableDirListing();
-    if (dirStatusList == null) {
-      return;
-    }
-    for (SnapshottableDirectoryStatus dirStatus : dirStatusList) {
-      String currDir = dirStatus.getFullPath().toString();
-      if (!currDir.endsWith(Path.SEPARATOR)) {
-        currDir += Path.SEPARATOR;
-      }
-      String trashPath = currDir + FileSystem.TRASH_PREFIX;
-      HdfsFileStatus fileStatus = getFileInfo(trashPath, false, false, false);
-      if (fileStatus == null) {
-        LOG.info("Trash doesn't exist for snapshottable directory {}. "
-            + "Creating trash at {}", currDir, trashPath);
-        PermissionStatus permissionStatus = new PermissionStatus(getRemoteUser()
-            .getShortUserName(), null, SHARED_TRASH_PERMISSION);
-        mkdirs(trashPath, permissionStatus, false);
+  @Override
+  public void checkAndProvisionSnapshotTrashRoots() {
+    if (isSnapshotTrashRootEnabled) {
+      try {
+        SnapshottableDirectoryStatus[] dirStatusList =
+            getSnapshottableDirListing();
+        if (dirStatusList == null) {
+          return;
+        }
+        for (SnapshottableDirectoryStatus dirStatus : dirStatusList) {
+          String currDir = dirStatus.getFullPath().toString();
+          if (!currDir.endsWith(Path.SEPARATOR)) {
+            currDir += Path.SEPARATOR;
+          }
+          String trashPath = currDir + FileSystem.TRASH_PREFIX;
+          HdfsFileStatus fileStatus = getFileInfo(trashPath, false, false, false);
+          if (fileStatus == null) {
+            LOG.info("Trash doesn't exist for snapshottable directory {}. "
+                + "Creating trash at {}", currDir, trashPath);
+            PermissionStatus permissionStatus =
+                new PermissionStatus(getRemoteUser().getShortUserName(), null,
+                    SHARED_TRASH_PERMISSION);
+            mkdirs(trashPath, permissionStatus, false);
+          }
+        }
+      } catch (IOException e) {
+        final String msg =
+            "Could not provision Trash directory for existing "
+                + "snapshottable directories. Exiting Namenode.";
+        ExitUtil.terminate(1, msg);

Review comment: Pro: Terminating the NN in this case is a sure way of uncovering unexpected problems instead of hiding them in the logs. Con: I wonder if we really should terminate the NN when the Trash directory fails to be deployed. We could just log a warning message? Either way, I'm fine with both. Just a thought.

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
##########

@@ -2524,7 +2524,7 @@ public void testNameNodeCreateSnapshotTrashRootOnStartup()
     MiniDFSCluster cluster =
         new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
     try {
-      final DistributedFileSystem dfs = cluster.getFileSystem();
+      DistributedFileSystem dfs = cluster.getFileSystem();

Review comment: nit: add one more space before this line for alignment.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 547784)
    Time Spent: 20m  (was: 10m)

> Ensure snapshot root trash provisioning happens only post safe mode exit
> --------------------------------------------------------------------------
>
>                 Key: HDFS-15820
>                 URL: https://issues.apache.org/jira/browse/HDFS-15820
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
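The deferral idea under review reduces to: do not provision trash roots while the namesystem is still in safe mode. Below is a minimal standalone sketch of that idea, assuming hypothetical hook names rather than the actual FSNamesystem API; the real patch wires this into the NameNode's startup path instead.

{code:java}
import java.io.IOException;

public class SafeModeDeferredProvisioningSketch {
  interface NameSystem {
    boolean isInSafeMode();
    void checkAndProvisionSnapshotTrashRoots() throws IOException;
  }

  /** Poll until safe mode exits, then provision trash roots exactly once. */
  static void provisionAfterSafeModeExit(NameSystem ns) {
    Thread t = new Thread(() -> {
      try {
        while (ns.isInSafeMode()) {
          Thread.sleep(1000L); // wait for block/datanode thresholds
        }
        // mkdirs can no longer hit SafeModeException at this point
        ns.checkAndProvisionSnapshotTrashRoots();
      } catch (IOException | InterruptedException e) {
        // Whether to log here or terminate the NN is exactly the Pro/Con
        // debate in the review comment above.
        System.err.println("Snapshot trash provisioning failed: " + e);
      }
    }, "SnapshotTrashProvisioner");
    t.setDaemon(true);
    t.start();
  }
}
{code}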
[jira] [Commented] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279050#comment-17279050 ]

Siyao Meng commented on HDFS-15820:
-----------------------------------

[~shashikant] will do.

> Ensure snapshot root trash provisioning happens only post safe mode exit
> --------------------------------------------------------------------------
>
>                 Key: HDFS-15820
>                 URL: https://issues.apache.org/jira/browse/HDFS-15820
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts along with the trash emptier service, but the namenode might not be out of safe mode by then. This can fail the snapshot trash dir creation, thereby crashing the namenode. The idea here is to trigger snapshot trash provisioning only after safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode.
> The reported blocks 0 needs additional 1383 blocks to reach the threshold 0.9990 of total blocks 1385.
> The number of live datanodes 0 needs an additional 1 live datanodes to reach the minimum number 1.
> Safe mode will be turned off automatically once the thresholds have been reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
>         at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:967)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:936)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740)
> 2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode.
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming
[ https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278973#comment-17278973 ]

Jim Brennan commented on HDFS-15813:
------------------------------------

Patch 004 fixes the checkstyle issue. I don't think the unit test failures are related; I was not able to reproduce them locally.

> DataStreamer: keep sending heartbeat packets while streaming
> -------------------------------------------------------------
>
>                 Key: HDFS-15813
>                 URL: https://issues.apache.org/jira/browse/HDFS-15813
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>         Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, HDFS-15813.003.patch, HDFS-15813.004.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to ensure that heartbeats continue during data streaming, even in the face of a slow disk.
> As [~kihwal] noted, the absence of heartbeats during flush will be fixed in a separate jira. It doesn't look like this change was ever pushed back to Apache, so I am providing it here.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming
[ https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Brennan updated HDFS-15813:
-------------------------------
    Attachment: HDFS-15813.004.patch

> DataStreamer: keep sending heartbeat packets while streaming
> -------------------------------------------------------------
>
>                 Key: HDFS-15813
>                 URL: https://issues.apache.org/jira/browse/HDFS-15813
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>         Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, HDFS-15813.003.patch, HDFS-15813.004.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to ensure that heartbeats continue during data streaming, even in the face of a slow disk.
> As [~kihwal] noted, the absence of heartbeats during flush will be fixed in a separate jira. It doesn't look like this change was ever pushed back to Apache, so I am providing it here.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278911#comment-17278911 ]

Shashikant Banerjee commented on HDFS-15820:
--------------------------------------------

[~smeng], can you help review this?

> Ensure snapshot root trash provisioning happens only post safe mode exit
> --------------------------------------------------------------------------
>
>                 Key: HDFS-15820
>                 URL: https://issues.apache.org/jira/browse/HDFS-15820
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts along with the trash emptier service, but the namenode might not be out of safe mode by then. This can fail the snapshot trash dir creation, thereby crashing the namenode. The idea here is to trigger snapshot trash provisioning only after safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode.
[jira] [Commented] (HDFS-15817) Rename snapshots while marking them deleted
[ https://issues.apache.org/jira/browse/HDFS-15817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278909#comment-17278909 ] Shashikant Banerjee commented on HDFS-15817: [~szetszwo], can you please help review this? > Rename snapshots while marking them deleted > > > Key: HDFS-15817 > URL: https://issues.apache.org/jira/browse/HDFS-15817 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > With the ordered snapshot feature turned on, a snapshot will just be marked > as deleted, but won't actually be deleted, if it's not the oldest one. Since > the snapshot is only marked deleted, creation of a new snapshot with the same > name as the one marked deleted will fail. To mitigate such problems, the idea > here is to rename the snapshot being marked as deleted by appending the > deletion timestamp along with the snapshot id to it.
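A tiny sketch of the renaming idea, with the caveat that the separator and field order below are illustrative assumptions, not necessarily the exact format the patch uses:
{code:java}
// Illustrative only -- the rename format is an assumption.
public class DeletedSnapshotNaming {
  static String markDeletedName(String snapshotName, int snapshotId) {
    long deletionTime = System.currentTimeMillis();
    // e.g. "s1" -> "s1.deleted.42.1612500000000"; the original name is
    // freed up, so a new snapshot called "s1" can be created again.
    return snapshotName + ".deleted." + snapshotId + "." + deletionTime;
  }

  public static void main(String[] args) {
    System.out.println(markDeletedName("s1", 42));
  }
}
{code}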
[jira] [Updated] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDFS-15820: --- Description: Currently, on namenode startup, snapshot trash root provisioning starts along with the trash emptier service, but the namenode might not be out of safe mode by then. This can fail the snapshot trash dir creation, thereby crashing the namenode. The idea here is to trigger snapshot trash provisioning only post safe mode exit. {code:java} 2021-02-04 11:23:47,323 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately. org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode. The reported blocks 0 needs additional 1383 blocks to reach the threshold 0.9990 of total blocks 1385. The number of live datanodes 0 needs an additional 1 live datanodes to reach the minimum number 1. Safe mode will be turned off automatically once the thresholds have been reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939) at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61) at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:967) at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:936) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740) 2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode. {code} was:Currently, on namenode startup, snapshot trash root provisioning starts along with the trash emptier service, but the namenode might not be out of safe mode by then. This can fail the snapshot trash dir creation, thereby crashing the namenode. The idea here is to trigger snapshot trash provisioning only post safe mode exit. > Ensure snapshot root trash provisioning happens only post safe mode exit > > > Key: HDFS-15820 > URL: https://issues.apache.org/jira/browse/HDFS-15820 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, on namenode startup, snapshot trash root provisioning starts > along with the trash emptier service, but the namenode might not be out of > safe mode by then. This can fail the snapshot trash dir creation, thereby > crashing the namenode. The idea here is to trigger snapshot trash > provisioning only post safe mode exit. > {code:java} > 2021-02-04 11:23:47,323 ERROR > org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring > NN shutdown. Shutting down immediately. > org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create > directory /upgrade/.Trash. Name node is in safe mode. 
> The reported blocks 0 needs additional 1383 blocks to reach the threshold > 0.9990 of total blocks 1385. > The number of live datanodes 0 needs an additional 1 live datanodes to reach > the minimum number 1. > Safe mode will be turned off automatically once the thresholds have been > reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269) > at > org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939) > at > org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61) > at >
[jira] [Updated] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-15820: -- Labels: pull-request-available (was: ) > Ensure snapshot root trash provisioning happens only post safe mode exit > > > Key: HDFS-15820 > URL: https://issues.apache.org/jira/browse/HDFS-15820 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, on namenode startup, snapshot trash root provisioning starts > along with the trash emptier service, but the namenode might not be out of > safe mode by then. This can fail the snapshot trash dir creation, thereby > crashing the namenode. The idea here is to trigger snapshot trash > provisioning only post safe mode exit.
[jira] [Work logged] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
[ https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547686=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547686 ] ASF GitHub Bot logged work on HDFS-15820: - Author: ASF GitHub Bot Created on: 04/Feb/21 15:24 Start Date: 04/Feb/21 15:24 Worklog Time Spent: 10m Work Description: bshashikant opened a new pull request #2682: URL: https://github.com/apache/hadoop/pull/2682 Please see https://issues.apache.org/jira/browse/HDFS-15820. Issue Time Tracking --- Worklog Id: (was: 547686) Remaining Estimate: 0h Time Spent: 10m > Ensure snapshot root trash provisioning happens only post safe mode exit > > > Key: HDFS-15820 > URL: https://issues.apache.org/jira/browse/HDFS-15820 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently, on namenode startup, snapshot trash root provisioning starts > along with the trash emptier service, but the namenode might not be out of > safe mode by then. This can fail the snapshot trash dir creation, thereby > crashing the namenode. The idea here is to trigger snapshot trash > provisioning only post safe mode exit.
[jira] [Created] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit
Shashikant Banerjee created HDFS-15820: -- Summary: Ensure snapshot root trash provisioning happens only post safe mode exit Key: HDFS-15820 URL: https://issues.apache.org/jira/browse/HDFS-15820 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Currently, on namenode startup, snapshot trash root provisioning starts along with the trash emptier service, but the namenode might not be out of safe mode by then. This can fail the snapshot trash dir creation, thereby crashing the namenode. The idea here is to trigger snapshot trash provisioning only post safe mode exit.
[jira] [Commented] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
[ https://issues.apache.org/jira/browse/HDFS-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278864#comment-17278864 ] David Mollitor commented on HDFS-15790: --- OK. This looks OK to me. As I said in my original issue, both engines were loaded into the same JVM and they would both fight at the point of registration. It looks like things are now set up so that they both register in the same static way and they don't explode when they both register. Thanks. > Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist > -- > > Key: HDFS-15790 > URL: https://issues.apache.org/jira/browse/HDFS-15790 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Changing from Protobuf 2 to Protobuf 3 broke some things in the Apache Hive > project. This was not an awesome thing to do between minor versions with > regard to backwards compatibility for downstream projects. > Additionally, these two frameworks are not drop-in replacements; they have > some differences. Also, Protobuf 2 is not deprecated, so let us have both > protocols available at the same time. In Hadoop 4.x Protobuf 2 support can > be dropped.
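The "register in the same static way without exploding" behavior described in the comment can be pictured as idempotent registration under distinct keys. A sketch under those assumptions; this is not the actual Hadoop RPC code, and all names below are invented for illustration:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: each engine registers under its own key, and
// repeated registration is a no-op, so both can live in one JVM.
public class EngineRegistry {
  private static final Map<String, Object> ENGINES = new ConcurrentHashMap<>();

  static void register(String engineKey, Object engine) {
    ENGINES.putIfAbsent(engineKey, engine); // no fight on re-registration
  }

  public static void main(String[] args) {
    register("ProtobufRpcEngine", new Object());
    register("ProtobufRpcEngine2", new Object());
    register("ProtobufRpcEngine", new Object()); // harmless duplicate
    System.out.println(ENGINES.keySet());
  }
}
{code}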
[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage
[ https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278814#comment-17278814 ] Renukaprasad C commented on HDFS-15792: --- Thanks [~hexiaoqiao]. I am also not happy to write such code. This is a workaround I made to compile on JDK 8; this syntax is not needed if we use JDK 7. From some of the logs generated above, I noticed the env uses Java 8. As of now this is used only in a test; do we still need to focus on this checkstyle fix? Or do you have any alternative approach to address this compile issue? > ClasscastException while loading FSImage > > > Key: HDFS-15792 > URL: https://issues.apache.org/jira/browse/HDFS-15792 > Project: Hadoop HDFS > Issue Type: Bug > Components: nn >Reporter: Renukaprasad C >Assignee: Renukaprasad C >Priority: Major > Fix For: 3.3.1, 3.4.0 > > Attachments: HDFS-15792-branch-2.10.001.patch, > HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, > HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, > HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png > > > FSImage loading has failed with a ClassCastException - > java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to > java.util.HashMap$TreeNode. > This is a usage issue with HashMap in concurrent scenarios. > The same issue has been reported against Java and closed as a usage issue: > https://bugs.openjdk.java.net/browse/JDK-8173671 > 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading > INODE from fsiamge. | FSImageFormatProtobuf.java:442 > java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode > at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835) > at java.util.HashMap$TreeNode.treeify(HashMap.java:1951) > at java.util.HashMap.treeifyBin(HashMap.java:772) > at java.util.HashMap.putVal(HashMap.java:644) > at java.util.HashMap.put(HashMap.java:612) > at > org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53) > at > org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391) > at > org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from > FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, > cpktTxId=00198227480) | FSImage.java:738 > java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node > cannot be cast to java.util.HashMap$TreeNode > at > org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68) > at > 
org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113) > at >
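Although the stack trace is truncated above, the failure pattern is clear from the description: a plain java.util.HashMap mutated by concurrent put() calls (here via ReferenceCountMap during parallel fsimage loading) can corrupt a bin during treeification, which later surfaces as exactly this ClassCastException. A minimal sketch of the thread-safe pattern; the field and method names are illustrative, not ReferenceCountMap's actual API:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: a ConcurrentHashMap (or external synchronization
// around the map) avoids the concurrent-treeification corruption that a
// plain HashMap can suffer under multi-threaded put().
public class ConcurrentPutFix {
  // was conceptually: a plain HashMap shared across loader threads
  private final Map<String, Integer> refCounts = new ConcurrentHashMap<>();

  public void put(String key) {
    refCounts.merge(key, 1, Integer::sum); // atomic read-modify-write
  }

  public static void main(String[] args) throws InterruptedException {
    ConcurrentPutFix counts = new ConcurrentPutFix();
    Runnable writer = () -> {
      for (int i = 0; i < 100_000; i++) {
        counts.put("acl" + (i % 7));
      }
    };
    Thread t1 = new Thread(writer);
    Thread t2 = new Thread(writer);
    t1.start(); t2.start();
    t1.join(); t2.join();
    System.out.println(counts.refCounts); // stable counts, no corruption
  }
}
{code}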
[jira] [Work logged] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-15819?focusedWorklogId=547603=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547603 ] ASF GitHub Bot logged work on HDFS-15819: - Author: ASF GitHub Bot Created on: 04/Feb/21 11:45 Start Date: 04/Feb/21 11:45 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2681: URL: https://github.com/apache/hadoop/pull/2681#issuecomment-773247145 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 9s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 22s | | trunk passed | | +1 :green_heart: | compile | 1m 20s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 15s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | checkstyle | 0m 59s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 21s | | trunk passed | | +1 :green_heart: | shadedclient | 15m 28s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 56s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 29s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +0 :ok: | spotbugs | 3m 6s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 3m 4s | | trunk passed | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 10s | | the patch passed | | +1 :green_heart: | compile | 1m 15s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 15s | | the patch passed | | +1 :green_heart: | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | javac | 1m 5s | | the patch passed | | +1 :green_heart: | checkstyle | 0m 52s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 13s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 12m 36s | | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 47s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | findbugs | 3m 8s | | the patch passed | _ Other Tests _ | | -1 :x: | unit | 227m 44s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2681/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. 
| | | | 313m 9s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2681/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2681 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c6784034c641 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 15a1f7adfc0 | | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2681/1/testReport/ | | Max. process+thread count | 3062 (vs. ulimit of 5500) | | modules | C:
[jira] [Work logged] (HDFS-15761) Dead NORMAL DN shouldn't transit to DECOMMISSIONED immediately
[ https://issues.apache.org/jira/browse/HDFS-15761?focusedWorklogId=547526=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547526 ] ASF GitHub Bot logged work on HDFS-15761: - Author: ASF GitHub Bot Created on: 04/Feb/21 09:47 Start Date: 04/Feb/21 09:47 Worklog Time Spent: 10m Work Description: tasanuma commented on a change in pull request #2588: URL: https://github.com/apache/hadoop/pull/2588#discussion_r570072022 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java ## @@ -383,30 +383,70 @@ public void testDecommissionStatusAfterDNRestart() throws Exception { /** * Verify the support for decommissioning a datanode that is already dead. - * Under this scenario the datanode should immediately be marked as - * DECOMMISSIONED + * Under this scenario the datanode should be marked as + * DECOMMISSION_IN_PROGRESS first. When pendingReplicationBlocksCount and + * underReplicatedBlocksCount are both 0, it becomes DECOMMISSIONED. */ @Test(timeout=12) public void testDecommissionDeadDN() throws Exception { Logger log = Logger.getLogger(DatanodeAdminManager.class); log.setLevel(Level.DEBUG); -DatanodeID dnID = cluster.getDataNodes().get(0).getDatanodeId(); -String dnName = dnID.getXferAddr(); -DataNodeProperties stoppedDN = cluster.stopDataNode(0); -DFSTestUtil.waitForDatanodeState(cluster, dnID.getDatanodeUuid(), -false, 3); + +DistributedFileSystem fileSystem = cluster.getFileSystem(); + +// Create a file with one block. That block has one replica. +Path f = new Path("decommission.dat"); +DFSTestUtil.createFile(fileSystem, f, fileSize, fileSize, fileSize, +(short)1, seed); + +// Find the DN that owns the only replica. +RemoteIterator fileList = +fileSystem.listLocatedStatus(f); +BlockLocation[] blockLocations = fileList.next().getBlockLocations(); +String[] dnNames = blockLocations[0].getNames(); Review comment: As the target DN is a single host, we may not need to use a String array and for-loop. ```java String dnName = blockLocations[0].getNames()[0]; ``` ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java ## @@ -383,30 +383,70 @@ public void testDecommissionStatusAfterDNRestart() throws Exception { /** * Verify the support for decommissioning a datanode that is already dead. - * Under this scenario the datanode should immediately be marked as - * DECOMMISSIONED + * Under this scenario the datanode should be marked as + * DECOMMISSION_IN_PROGRESS first. When pendingReplicationBlocksCount and + * underReplicatedBlocksCount are both 0, it becomes DECOMMISSIONED. */ @Test(timeout=12) public void testDecommissionDeadDN() throws Exception { Logger log = Logger.getLogger(DatanodeAdminManager.class); log.setLevel(Level.DEBUG); -DatanodeID dnID = cluster.getDataNodes().get(0).getDatanodeId(); -String dnName = dnID.getXferAddr(); -DataNodeProperties stoppedDN = cluster.stopDataNode(0); -DFSTestUtil.waitForDatanodeState(cluster, dnID.getDatanodeUuid(), -false, 3); + +DistributedFileSystem fileSystem = cluster.getFileSystem(); + +// Create a file with one block. That block has one replica. +Path f = new Path("decommission.dat"); +DFSTestUtil.createFile(fileSystem, f, fileSize, fileSize, fileSize, +(short)1, seed); + +// Find the DN that owns the only replica. 
+RemoteIterator fileList = +fileSystem.listLocatedStatus(f); +BlockLocation[] blockLocations = fileList.next().getBlockLocations(); +String[] dnNames = blockLocations[0].getNames(); + +// Stop the DN leads to 1 block under-replicated +DataNodeProperties[] stoppedDNs = new DataNodeProperties[dnNames.length]; +for (int i = 0; i < dnNames.length; i++) { + stoppedDNs[i] = cluster.stopDataNode(dnNames[i]); +} + FSNamesystem fsn = cluster.getNamesystem(); final DatanodeManager dm = fsn.getBlockManager().getDatanodeManager(); -DatanodeDescriptor dnDescriptor = dm.getDatanode(dnID); -decommissionNode(dnName); +final List dead = new ArrayList(); +while (true) { + dm.fetchDatanodes(null, dead, false); + if (dead.size() == 3) { Review comment: Why wait for `dead.size()==3`? They all seem to be the same host. And it would be better to use `GenericTestUtils.waitFor` instead of the `while(true)` loop. ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java ## @@ -453,10 +493,10 @@ public void testDecommissionLosingData() throws Exception {
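To make the reviewer's second suggestion concrete, here is a sketch of replacing the `while(true)` poll with `GenericTestUtils.waitFor`. The poll interval and timeout are illustrative values, and the exact `Supplier` overload differs across branches (Guava vs. `java.util.function`), though a lambda satisfies either:
```java
// Illustrative rewrite of the polling loop from the diff above. The
// enclosing test method already declares `throws Exception`, which covers
// waitFor's TimeoutException and InterruptedException.
GenericTestUtils.waitFor(() -> {
  List<DatanodeDescriptor> deadNodes = new ArrayList<>();
  dm.fetchDatanodes(null, deadNodes, false);
  return !deadNodes.isEmpty();
}, 100, 30000);
```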
[jira] [Commented] (HDFS-15812) after deleting data of hbase table hdfs size is not decreasing
[ https://issues.apache.org/jira/browse/HDFS-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278715#comment-17278715 ] Satya Gaurav commented on HDFS-15812: - the value of hbase.master.hfilecleaner.ttl is 3600000 ms, i.e. 1 hr > after deleting data of hbase table hdfs size is not decreasing > -- > > Key: HDFS-15812 > URL: https://issues.apache.org/jira/browse/HDFS-15812 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.0.2-alpha > Environment: HDP 3.1.4.0-315 > Hbase 2.0.2.3.1.4.0-315 >Reporter: Satya Gaurav >Priority: Major > > I am deleting data from an HBase table; it is deleted from the table, but > the size of the HDFS directory is not decreasing. I even ran a major > compaction, but the HDFS size still did not decrease. Any solution for this > issue?
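For context on why that TTL matters (a likely explanation, not a confirmed diagnosis for this report): when HBase deletes or compacts away HFiles, they are first moved to an archive directory, and the master's HFile cleaner only removes them from HDFS after hbase.master.hfilecleaner.ttl elapses, so HDFS usage can lag HBase deletions by at least that long. The property is set in hbase-site.xml:
{code:xml}
<!-- hbase-site.xml: TTL (ms) the HFile cleaner waits before deleting
     archived HFiles from HDFS. With 3600000 (1 hour), HDFS usage is
     expected to lag HBase deletions by up to an hour. -->
<property>
  <name>hbase.master.hfilecleaner.ttl</name>
  <value>3600000</value>
</property>
{code}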
[jira] [Work logged] (HDFS-15818) Fix TestFsDatasetImpl.testReadLockCanBeDisabledByConfig
[ https://issues.apache.org/jira/browse/HDFS-15818?focusedWorklogId=547500=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547500 ] ASF GitHub Bot logged work on HDFS-15818: - Author: ASF GitHub Bot Created on: 04/Feb/21 09:05 Start Date: 04/Feb/21 09:05 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2679: URL: https://github.com/apache/hadoop/pull/2679#issuecomment-773148937 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 33s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 34m 12s | | trunk passed | | +1 :green_heart: | compile | 1m 20s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 13s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | checkstyle | 0m 59s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 23s | | trunk passed | | +1 :green_heart: | shadedclient | 16m 5s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 54s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 32s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +0 :ok: | spotbugs | 3m 19s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 3m 15s | | trunk passed | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 15s | | the patch passed | | +1 :green_heart: | compile | 1m 20s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 20s | | the patch passed | | +1 :green_heart: | compile | 1m 8s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | javac | 1m 8s | | the patch passed | | -0 :warning: | checkstyle | 0m 56s | [/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2679/1/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 41 unchanged - 0 fixed = 43 total (was 41) | | +1 :green_heart: | mvnsite | 1m 16s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 13m 58s | | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 48s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 21s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | findbugs | 3m 7s | | the patch passed | _ Other Tests _ | | -1 :x: | unit | 195m 42s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2679/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 42s | | The patch does not generate ASF License warnings. 
| | | | 284m 43s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2679/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2679 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4af8bbd0fc69 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 15a1f7adfc0 | | Default Java | Private
[jira] [Commented] (HDFS-15757) RBF: Improving Router Connection Management
[ https://issues.apache.org/jira/browse/HDFS-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278649#comment-17278649 ] Xiaoqiao He commented on HDFS-15757: Thanks [~fengnanli] for your work and the update. The new patch looks good to me. One concern is whether it will add extra end-to-end latency. In the original implementation there is a chance to multiplex a connection for the same request (maybe with low probability) due to the postponed recycle. After the changes, the reuse probability will be reduced, IMO. And setting up a new connection is a more costly operation compared to reusing one. So did you observe any end-to-end latency cost? Please correct me if something is wrong. Thanks. > RBF: Improving Router Connection Management > --- > > Key: HDFS-15757 > URL: https://issues.apache.org/jira/browse/HDFS-15757 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Reporter: Fengnan Li >Assignee: Fengnan Li >Priority: Major > Labels: pull-request-available > Attachments: RBF_ Improving Router Connection Management_v2.pdf, RBF_ > Improving Router Connection Management_v3.pdf, RBF_ Router Connection > Management.pdf > > Time Spent: 3h 10m > Remaining Estimate: 0h > > We have seen a high number of connections from the Router to namenodes, > leaving namenodes unstable. > This ticket is trying to reduce connections through some changes. Please take > a look at the design and leave comments. > Thanks!
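The latency trade-off raised in the comment can be pictured with a toy pool: reusing a pooled connection is cheap, creating one is expensive, so any recycling policy that lowers the reuse probability shows up as end-to-end latency. A sketch only; this is not the actual RBF ConnectionManager, and all names below are invented:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of connection reuse vs. setup cost.
public class ConnectionPoolSketch {
  private final Map<String, Connection> pool = new ConcurrentHashMap<>();

  public Connection get(String namenode) {
    // reuse on a hit; pay the expensive setup only on a miss
    return pool.computeIfAbsent(namenode, Connection::new);
  }

  static class Connection {
    Connection(String target) {
      // the expensive part in the real router: TCP connect + SASL handshake
    }
  }

  public static void main(String[] args) {
    ConnectionPoolSketch router = new ConnectionPoolSketch();
    Connection first = router.get("nn1:8020");
    Connection second = router.get("nn1:8020");
    System.out.println(first == second); // true: reused, no second setup cost
  }
}
{code}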
[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage
[ https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278633#comment-17278633 ] Xiaoqiao He commented on HDFS-15792: [~prasad-acit] a checkstyle issue was reported by Yetus; would you mind fixing it? BTW, IMO the type conversion is not necessary here. FYI. Thanks. {code:java} public ImmutableList getEntries() { -return new ImmutableList.Builder().addAll(referenceMap.keySet()).build(); +return new ImmutableList.Builder().addAll((new java.util.HashMap<> +(referenceMap)).keySet()).build(); } {code} > ClasscastException while loading FSImage > > > Key: HDFS-15792 > URL: https://issues.apache.org/jira/browse/HDFS-15792 > Project: Hadoop HDFS > Issue Type: Bug > Components: nn >Reporter: Renukaprasad C >Assignee: Renukaprasad C >Priority: Major > Fix For: 3.3.1, 3.4.0 > > Attachments: HDFS-15792-branch-2.10.001.patch, > HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, > HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, > HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png > > > FSImage loading has failed with a ClassCastException - > java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to > java.util.HashMap$TreeNode. > This is a usage issue with HashMap in concurrent scenarios. > The same issue has been reported against Java and closed as a usage issue: > https://bugs.openjdk.java.net/browse/JDK-8173671 > 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading > INODE from fsiamge. | FSImageFormatProtobuf.java:442 > java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode > at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835) > at java.util.HashMap$TreeNode.treeify(HashMap.java:1951) > at java.util.HashMap.treeifyBin(HashMap.java:772) > at java.util.HashMap.putVal(HashMap.java:644) > at java.util.HashMap.put(HashMap.java:612) > at > org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53) > at > org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391) > at > org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from > FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, > cpktTxId=00198227480) | FSImage.java:738 > java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node > cannot be cast to java.util.HashMap$TreeNode > at > org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68) > at > 
org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113) > at >
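Picking up [~hexiaoqiao]'s point that the type conversion should be unnecessary: one cleaner possibility is Guava's ImmutableList.copyOf, which avoids both the raw Builder and the defensive HashMap copy. This is a sketch only, not the committed fix, and whether it compiles cleanly across the JDK 7/8 matrix discussed above would need verification:
{code:java}
// Possible alternative sketch -- assumes Guava's ImmutableList, which
// Hadoop already depends on.
public ImmutableList<E> getEntries() {
  return ImmutableList.copyOf(referenceMap.keySet());
}
{code}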