[jira] [Work logged] (HDFS-15683) Allow configuring DISK/ARCHIVE capacity for individual volumes

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15683?focusedWorklogId=548031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-548031
 ]

ASF GitHub Bot logged work on HDFS-15683:
-

Author: ASF GitHub Bot
Created on: 05/Feb/21 04:25
Start Date: 05/Feb/21 04:25
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2625:
URL: https://github.com/apache/hadoop/pull/2625#issuecomment-773778344


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |  14m 26s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m  1s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 51s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   3m  5s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m  3s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | -1 :x: |  javac  |   1m 11s | 
[/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2625/8/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt)
 |  hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 
with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 14 new + 580 
unchanged - 14 fixed = 594 total (was 594)  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 820 unchanged - 1 
fixed = 820 total (was 821)  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  12m 41s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m  0s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 192m 10s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2625/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 41s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 291m  0s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.TestDFSClientExcludedNodes |
   |   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2625/8/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2625 |
   | Optional Tests | dupname asflicense compile javac javadoc 

[jira] [Commented] (HDFS-15684) EC: Call recoverLease on DFSStripedOutputStream close exception

2021-02-04 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279306#comment-17279306
 ] 

Hui Fei commented on HDFS-15684:


[~wanghongbing] cherry-pick failed, could you please provide patches for 
branch-3.3, 3.2 and 3.1? Thanks

> EC: Call recoverLease on DFSStripedOutputStream close exception
> ---
>
> Key: HDFS-15684
> URL: https://issues.apache.org/jira/browse/HDFS-15684
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, ec
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15684.001.patch, HDFS-15684.002.patch, 
> HDFS-15684.003.patch
>
>
> HDFS-14694 added a feature that calls the recoverLease operation 
> automatically when DFSOutputStream close encounters an exception. When we 
> wanted to apply this feature to our cluster, we found that it does not 
> support EC files. I think this feature should take effect for both 
> replicated files and EC files. This Jira proposes to make it effective in 
> the case of EC files.
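
A minimal sketch of the pattern this feature automates, for context (the file 
path and stream setup are hypothetical; recoverLease() is the real 
DistributedFileSystem API):

{code:java}
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// If close() throws, ask the NameNode to recover the lease so the file
// does not stay open indefinitely. HDFS-14694 automated this for
// replicated files; this Jira extends it to DFSStripedOutputStream (EC).
void closeWithLeaseRecovery(DistributedFileSystem dfs, Path file,
    OutputStream out) throws IOException {
  try {
    out.close();
  } catch (IOException e) {
    dfs.recoverLease(file);  // best-effort; returns true once recovered
    throw e;
  }
}
{code}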



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15798) EC: Reconstruct task failed, and It would be XmitsInProgress of DN has negative number

2021-02-04 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279300#comment-17279300
 ] 

Hui Fei commented on HDFS-15798:


Committed to trunk, and cherry-picked to branch-3.3, 3.2 and 3.1.

[~haiyang Hu] [~sodonnell] Thanks again.

> EC: Reconstruct task failed, and It would be XmitsInProgress of DN has 
> negative number
> --
>
> Key: HDFS-15798
> URL: https://issues.apache.org/jira/browse/HDFS-15798
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HDFS-15798.001.patch, HDFS-15798.002.patch, 
> HDFS-15798.003.patch
>
>
> The EC reconstruction task failed, and processErasureCodingTasks decremented 
> XmitsInProgress by an abnormal value. As a result, the DN's XmitsInProgress 
> can go negative, which affects how the NN chooses pending tasks based on the 
> ratio between the lengths of the replication and erasure-coded block queues.
> {code:java}
> // 1.ErasureCodingWorker.java
> public void processErasureCodingTasks(
> Collection<BlockECReconstructionInfo> ecTasks) {
>   for (BlockECReconstructionInfo reconInfo : ecTasks) {
> int xmitsSubmitted = 0;
> try {
>   ...
>   // It may throw IllegalArgumentException from task#stripedReader
>   // constructor.
>   final StripedBlockReconstructor task =
>   new StripedBlockReconstructor(this, stripedReconInfo);
>   if (task.hasValidTargets()) {
> // See HDFS-12044. We increase xmitsInProgress even if the task is only
> // enqueued, so that
> //   1) NN will not send more tasks than what DN can execute and
> //   2) DN will not throw away reconstruction tasks, and instead keeps
> //  an unbounded number of tasks in the executor's task queue.
> xmitsSubmitted = Math.max((int)(task.getXmits() * xmitWeight), 1);
> getDatanode().incrementXmitsInProcess(xmitsSubmitted); //  task start 
> increment
> stripedReconstructionPool.submit(task);
>   } else {
> LOG.warn("No missing internal block. Skip reconstruction for task:{}",
> reconInfo);
>   }
> } catch (Throwable e) {
>   getDatanode().decrementXmitsInProgress(xmitsSubmitted); //  task failed 
> decrement,  XmitsInProgress is decremented by the previous value
>   LOG.warn("Failed to reconstruct striped block {}",
>   reconInfo.getExtendedBlock().getLocalBlock(), e);
> }
>   }
> }
> // 2.StripedBlockReconstructor.java
> public void run() {
>   try {
> initDecoderIfNecessary();
>...
>   } catch (Throwable e) {
> LOG.warn("Failed to reconstruct striped block: {}", getBlockGroup(), e);
> getDatanode().getMetrics().incrECFailedReconstructionTasks();
>   } finally {
> float xmitWeight = getErasureCodingWorker().getXmitWeight();
> // if the xmits is smaller than 1, the xmitsSubmitted should be set to 1
> // because if it is set to zero, we cannot measure the xmits submitted
> int xmitsSubmitted = Math.max((int) (getXmits() * xmitWeight), 1);
> getDatanode().decrementXmitsInProgress(xmitsSubmitted); // task complete 
> decrement
> ...
>   }
> }{code}
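
As the inline comments indicate, the fix keeps the increment and the decrement 
symmetric: the exact amount added when the task starts must be the amount 
subtracted when it fails or completes, otherwise the counter drifts negative. 
A standalone sketch of that invariant (simplified, not the actual DataNode 
code):

{code:java}
import java.util.concurrent.atomic.AtomicInteger;

class XmitsAccounting {
  private final AtomicInteger xmitsInProgress = new AtomicInteger();

  // Record the exact amount added at submit time...
  int submit(double xmits, float xmitWeight) {
    int submitted = Math.max((int) (xmits * xmitWeight), 1);
    xmitsInProgress.addAndGet(submitted);
    return submitted;
  }

  // ...and subtract that same amount on failure or completion.
  // Recomputing the amount here, or decrementing when no increment
  // happened, is what lets XmitsInProgress go negative.
  void finish(int submitted) {
    xmitsInProgress.addAndGet(-submitted);
  }
}
{code}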



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15798) EC: Reconstruct task failed, and It would be XmitsInProgress of DN has negative number

2021-02-04 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-15798:
---
Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> EC: Reconstruct task failed, and It would be XmitsInProgress of DN has 
> negative number
> --
>
> Key: HDFS-15798
> URL: https://issues.apache.org/jira/browse/HDFS-15798
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HDFS-15798.001.patch, HDFS-15798.002.patch, 
> HDFS-15798.003.patch
>
>
> The EC reconstruction task failed, and processErasureCodingTasks decremented 
> XmitsInProgress by an abnormal value. As a result, the DN's XmitsInProgress 
> can go negative, which affects how the NN chooses pending tasks based on the 
> ratio between the lengths of the replication and erasure-coded block queues.
> {code:java}
> // 1.ErasureCodingWorker.java
> public void processErasureCodingTasks(
> Collection<BlockECReconstructionInfo> ecTasks) {
>   for (BlockECReconstructionInfo reconInfo : ecTasks) {
> int xmitsSubmitted = 0;
> try {
>   ...
>   // It may throw IllegalArgumentException from task#stripedReader
>   // constructor.
>   final StripedBlockReconstructor task =
>   new StripedBlockReconstructor(this, stripedReconInfo);
>   if (task.hasValidTargets()) {
> // See HDFS-12044. We increase xmitsInProgress even if the task is only
> // enqueued, so that
> //   1) NN will not send more tasks than what DN can execute and
> //   2) DN will not throw away reconstruction tasks, and instead keeps
> //  an unbounded number of tasks in the executor's task queue.
> xmitsSubmitted = Math.max((int)(task.getXmits() * xmitWeight), 1);
> getDatanode().incrementXmitsInProcess(xmitsSubmitted); //  task start 
> increment
> stripedReconstructionPool.submit(task);
>   } else {
> LOG.warn("No missing internal block. Skip reconstruction for task:{}",
> reconInfo);
>   }
> } catch (Throwable e) {
>   getDatanode().decrementXmitsInProgress(xmitsSubmitted); //  task failed 
> decrement,  XmitsInProgress is decremented by the previous value
>   LOG.warn("Failed to reconstruct striped block {}",
>   reconInfo.getExtendedBlock().getLocalBlock(), e);
> }
>   }
> }
> // 2.StripedBlockReconstructor.java
> public void run() {
>   try {
> initDecoderIfNecessary();
>...
>   } catch (Throwable e) {
> LOG.warn("Failed to reconstruct striped block: {}", getBlockGroup(), e);
> getDatanode().getMetrics().incrECFailedReconstructionTasks();
>   } finally {
> float xmitWeight = getErasureCodingWorker().getXmitWeight();
> // if the xmits is smaller than 1, the xmitsSubmitted should be set to 1
> // because if it is set to zero, we cannot measure the xmits submitted
> int xmitsSubmitted = Math.max((int) (getXmits() * xmitWeight), 1);
> getDatanode().decrementXmitsInProgress(xmitsSubmitted); // task complete 
> decrement
> ...
>   }
> }{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-04 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-15819.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Fix a codestyle issue for TestQuotaByStorageType
> 
>
> Key: HDFS-15819
> URL: https://issues.apache.org/jira/browse/HDFS-15819
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-04 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-15819:
---
Priority: Trivial  (was: Major)

> Fix a codestyle issue for TestQuotaByStorageType
> 
>
> Key: HDFS-15819
> URL: https://issues.apache.org/jira/browse/HDFS-15819
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-04 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-15819:
---
Issue Type: Improvement  (was: New Feature)

> Fix a codestyle issue for TestQuotaByStorageType
> 
>
> Key: HDFS-15819
> URL: https://issues.apache.org/jira/browse/HDFS-15819
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15819?focusedWorklogId=548007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-548007
 ]

ASF GitHub Bot logged work on HDFS-15819:
-

Author: ASF GitHub Bot
Created on: 05/Feb/21 02:11
Start Date: 05/Feb/21 02:11
Worklog Time Spent: 10m 
  Work Description: ferhui merged pull request #2681:
URL: https://github.com/apache/hadoop/pull/2681


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 548007)
Time Spent: 40m  (was: 0.5h)

> Fix a codestyle issue for TestQuotaByStorageType
> 
>
> Key: HDFS-15819
> URL: https://issues.apache.org/jira/browse/HDFS-15819
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15819?focusedWorklogId=548004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-548004
 ]

ASF GitHub Bot logged work on HDFS-15819:
-

Author: ASF GitHub Bot
Created on: 05/Feb/21 02:08
Start Date: 05/Feb/21 02:08
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #2681:
URL: https://github.com/apache/hadoop/pull/2681#issuecomment-773731300


   @maobaolong Thanks for the fix, merged.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 548004)
Time Spent: 0.5h  (was: 20m)

> Fix a codestyle issue for TestQuotaByStorageType
> 
>
> Key: HDFS-15819
> URL: https://issues.apache.org/jira/browse/HDFS-15819
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547968&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547968
 ]

ASF GitHub Bot logged work on HDFS-15820:
-

Author: ASF GitHub Bot
Created on: 05/Feb/21 00:01
Start Date: 05/Feb/21 00:01
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2682:
URL: https://github.com/apache/hadoop/pull/2682#issuecomment-773683449


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 30s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   1m 12s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 19s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   3m  4s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m  2s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m  0s | 
[/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/2/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 299 unchanged 
- 0 fixed = 301 total (was 299)  |
   | +1 :green_heart: |  mvnsite  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  12m 49s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m  3s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 209m  7s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 297m 23s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
   |   | hadoop.hdfs.server.namenode.ha.TestObserverNode |
   |   | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
   |   | hadoop.hdfs.TestDecommissionWithStriped |
   |   | hadoop.hdfs.TestSetrepDecreasing |
   |   | hadoop.hdfs.TestErasureCodingPolicyWithSnapshot |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2682 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 9cd05c4da773 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 

[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-04 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279230#comment-17279230
 ] 

Kihwal Lee commented on HDFS-15813:
---

+1. Unit test failures seem unrelated. If you can't find an existing Jira for 
the failures, please file one for each. I've looked at 
{{TestUnderReplicatedBlocks#testSetRepIncWithUnderReplicatedBlocks}} briefly. 
It appears to be a test issue.

The test artificially invalidated a replica on a node, but before the test made 
further progress, the NN fixed the under-replication by having another node 
send the block to the same node. The test then went ahead and removed it from 
the NN's data structure (blocksmap) and called {{setReplication()}}. The NN 
picked two nodes, but one of them was the node that already had the block 
replica; it was only missing from the NN's data structure. Again, this happened 
because the NN fixed the under-replication between the test deleting the 
replica and modifying the NN data structure. The replication then failed with 
{{ReplicaAlreadyExistsException}}. This kind of inconsistency does not happen 
in real clusters, but even if it did, it would be fixed when the replication 
times out. The test is set to time out before the default replication timeout, 
so it didn't have any chance to do that.

> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, 
> HDFS-15813.003.patch, HDFS-15813.004.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, the absence of a heartbeat during flush will be fixed in 
> a separate jira. It doesn't look like this change was ever pushed back to 
> Apache, so I am providing it here.
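
The idea can be sketched as follows (schematic only; the helper methods stand 
in for the real packet-sending machinery and are not the actual DataStreamer 
API): emit a heartbeat packet whenever no packet has gone out for half the 
socket timeout, so a stall on a slow disk cannot starve the pipeline of 
heartbeats.

{code:java}
// Illustrative sender loop; streamIsOpen(), sendHeartbeat() and
// trySendNextDataPacket() are hypothetical helpers.
void sendLoop(long socketTimeoutMs) throws java.io.IOException {
  long half = socketTimeoutMs / 2;
  long lastPacketSentMs = System.currentTimeMillis();
  while (streamIsOpen()) {
    if (System.currentTimeMillis() - lastPacketSentMs >= half) {
      sendHeartbeat();               // keeps downstream DataNodes alive
      lastPacketSentMs = System.currentTimeMillis();
    }
    if (trySendNextDataPacket()) {   // may be delayed by a slow disk
      lastPacketSentMs = System.currentTimeMillis();
    }
  }
}
{code}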



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-04 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279213#comment-17279213
 ] 

Hadoop QA commented on HDFS-15813:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
50s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
39s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
25s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
48s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
11s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 37s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
57s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
15s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
42s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 1s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
6s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m  
6s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
42s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
42s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green}{color} | {color:green} hadoop-hdfs-project: The 
patch generated 0 new + 86 unchanged - 1 fixed = 86 total (was 87) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
2s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 55s{color} | {color:green}{color} | {color:green} patch 

[jira] [Created] (HDFS-15821) Add metrics for in-service datanodes

2021-02-04 Thread Zehao Chen (Jira)
Zehao Chen created HDFS-15821:
-

 Summary: Add metrics for in-service datanodes
 Key: HDFS-15821
 URL: https://issues.apache.org/jira/browse/HDFS-15821
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Zehao Chen
Assignee: Zehao Chen


We currently have metrics for live datanodes, but some of those datanodes may 
be in the decommissioning or maintenance state. Adding this metric allows us to 
know how many nodes are currently in service, where NumInServiceDatanodes = 
NumLiveDataNodes - NumDecomLiveDataNodes - NumInMaintenanceLiveDataNodes.
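
The relation is plain arithmetic over the existing metrics; as a sketch (the 
wiring into the actual metrics classes is not shown):

{code:java}
// NumInServiceDatanodes per the formula above.
int numInServiceDataNodes(int numLiveDataNodes, int numDecomLiveDataNodes,
    int numInMaintenanceLiveDataNodes) {
  return numLiveDataNodes - numDecomLiveDataNodes
      - numInMaintenanceLiveDataNodes;
}
{code}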



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15757) RBF: Improving Router Connection Management

2021-02-04 Thread Fengnan Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279190#comment-17279190
 ] 

Fengnan Li commented on HDFS-15757:
---

[~hexiaoqiao] Thanks for the question. There are three latencies:
1. RPC queue time: this measures the time spent in the RPC queue, which is not 
related to the change.
2. RPC processing time: this is measured before the actual proxy op (getting 
the TCP connection and talking to the NN), which is not related either.
3. Proxy time: this is directly impacted, since the change improves 
getConnection() a lot. I have done some flame graphs for the Router to 
understand the performance bottleneck, and I often see getConnection() in the 
stack taking a lot of time. With this change, connections are kept Active as 
much as possible, whereas previously connections were left not quite closed, 
hitting the connection cap for the pool so that no more active connections 
could be created.
From the last graph I included in the doc we can see that the ProxyTime is 
improved.
Feel free to give it a try in your setup. It's always good to have a second 
eye on it.
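
A rough sketch of the pool behavior being described (illustrative types and 
method names, not the actual RBF ConnectionManager code): evict connections 
that are "not quite closed" so the per-pool cap is spent on active connections 
rather than unusable ones.

{code:java}
import java.util.Iterator;
import java.util.List;

// Minimal stand-ins for illustration only.
interface Connection {
  boolean isStale();   // "not quite closed": unusable but still pooled
  boolean isActive();
  void close();
}

Connection getConnection(List<Connection> pool, Connection fresh,
    int maxPoolSize) {
  // Evict stale connections first; they count against the cap but can
  // never serve a request, which is what starved the pool before.
  for (Iterator<Connection> it = pool.iterator(); it.hasNext();) {
    Connection c = it.next();
    if (c.isStale()) {
      c.close();
      it.remove();
    }
  }
  for (Connection c : pool) {
    if (c.isActive()) {
      return c;                 // reuse an active connection
    }
  }
  if (pool.size() < maxPoolSize) {
    pool.add(fresh);            // room under the cap for a new connection
    return fresh;
  }
  return pool.isEmpty() ? fresh : pool.get(0);  // cap reached: reuse
}
{code}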

> RBF: Improving Router Connection Management
> ---
>
> Key: HDFS-15757
> URL: https://issues.apache.org/jira/browse/HDFS-15757
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: RBF_ Improving Router Connection Management_v2.pdf, RBF_ 
> Improving Router Connection Management_v3.pdf, RBF_ Router Connection 
> Management.pdf
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We have seen a high number of connections from the Router to namenodes, 
> leaving namenodes unstable.
> This ticket tries to reduce connections through some changes. Please take a 
> look at the design and leave comments.
> Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15817) Rename snapshots while marking them deleted

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15817?focusedWorklogId=547906&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547906
 ]

ASF GitHub Bot logged work on HDFS-15817:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 21:41
Start Date: 04/Feb/21 21:41
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2677:
URL: https://github.com/apache/hadoop/pull/2677#issuecomment-773621670


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  89m 46s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 39s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 58s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 54s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 50s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  17m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   4m  9s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 209m 30s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2677/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 45s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 366m 46s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.tools.TestViewFileSystemOverloadSchemeWithDFSAdmin |
   |   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
   |   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
   |   | hadoop.hdfs.server.namenode.snapshot.TestOrderedSnapshotDeletionGc |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2677/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2677 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 8829bc4f5317 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 5f34271bb14 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 

[jira] [Work logged] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547900&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547900
 ]

ASF GitHub Bot logged work on HDFS-15820:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 21:37
Start Date: 04/Feb/21 21:37
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2682:
URL: https://github.com/apache/hadoop/pull/2682#issuecomment-773619671


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  4s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  93m 13s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 57s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 16s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 13s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  6s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   1m  6s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 57s | 
[/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/1/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 299 unchanged 
- 0 fixed = 302 total (was 299)  |
   | +1 :green_heart: |  mvnsite  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  15m  7s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 15s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 221m  3s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 372m  2s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
   |   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2682/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2682 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux aa60dff82e51 4.15.0-126-generic #129-Ubuntu SMP Mon Nov 23 
18:53:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 5f34271bb14 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   | Multi-JDK versions | 

[jira] [Updated] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDFS-15820:
--
Status: Patch Available  (was: Open)

> Ensure snapshot root trash provisioning happens only post safe mode exit
> 
>
> Key: HDFS-15820
> URL: https://issues.apache.org/jira/browse/HDFS-15820
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts 
> along with the trash emptier service, but the namenode might not be out of 
> safe mode by then. This can fail the snapshot trash dir creation, thereby 
> crashing the namenode. The idea here is to trigger snapshot trash 
> provisioning only after safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring 
> NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create 
> directory /upgrade/.Trash. Name node is in safe mode.
> The reported blocks 0 needs additional 1383 blocks to reach the threshold 
> 0.9990 of total blocks 1385.
> The number of live datanodes 0 needs an additional 1 live datanodes to reach 
> the minimum number 1.
> Safe mode will be turned off automatically once the thresholds have been 
> reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:967)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:936)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740)
> 2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot 
> create directory /upgrade/.Trash. Name node is in safe mode.
> {code}
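
A minimal sketch of the proposed ordering (the thread wrapper is illustrative; 
isInSafeMode() and checkAndProvisionSnapshotTrashRoots() are the real 
FSNamesystem methods named in the stack trace):

{code:java}
// Defer trash provisioning until the NameNode has left safe mode, instead
// of running it unconditionally during active-service startup.
void provisionSnapshotTrashAfterSafeMode(FSNamesystem fsn) {
  Thread t = new Thread(() -> {
    while (fsn.isInSafeMode()) {
      try {
        Thread.sleep(1000);          // poll until safe mode exits
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
    fsn.checkAndProvisionSnapshotTrashRoots();  // mkdirs can now succeed
  }, "SnapshotTrashRootProvisioner");
  t.setDaemon(true);
  t.start();
}
{code}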



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547784
 ]

ASF GitHub Bot logged work on HDFS-15820:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 19:02
Start Date: 04/Feb/21 19:02
Worklog Time Spent: 10m 
  Work Description: smengcl commented on a change in pull request #2682:
URL: https://github.com/apache/hadoop/pull/2682#discussion_r570461526



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##
@@ -8531,25 +8527,37 @@ void checkAccess(String src, FsAction mode) throws IOException {
    * Check if snapshot roots are created for all existing snapshottable
    * directories. Create them if not.
    */
-  void checkAndProvisionSnapshotTrashRoots() throws IOException {
-    SnapshottableDirectoryStatus[] dirStatusList = getSnapshottableDirListing();
-    if (dirStatusList == null) {
-      return;
-    }
-    for (SnapshottableDirectoryStatus dirStatus : dirStatusList) {
-      String currDir = dirStatus.getFullPath().toString();
-      if (!currDir.endsWith(Path.SEPARATOR)) {
-        currDir += Path.SEPARATOR;
-      }
-      String trashPath = currDir + FileSystem.TRASH_PREFIX;
-      HdfsFileStatus fileStatus = getFileInfo(trashPath, false, false, false);
-      if (fileStatus == null) {
-        LOG.info("Trash doesn't exist for snapshottable directory {}. "
-            + "Creating trash at {}", currDir, trashPath);
-        PermissionStatus permissionStatus = new PermissionStatus(getRemoteUser()
-            .getShortUserName(), null, SHARED_TRASH_PERMISSION);
-        mkdirs(trashPath, permissionStatus, false);
+  @Override
+  public void checkAndProvisionSnapshotTrashRoots() {
+    if (isSnapshotTrashRootEnabled) {
+      try {
+        SnapshottableDirectoryStatus[] dirStatusList =
+            getSnapshottableDirListing();
+        if (dirStatusList == null) {
+          return;
+        }
+        for (SnapshottableDirectoryStatus dirStatus : dirStatusList) {
+          String currDir = dirStatus.getFullPath().toString();
+          if (!currDir.endsWith(Path.SEPARATOR)) {
+            currDir += Path.SEPARATOR;
+          }
+          String trashPath = currDir + FileSystem.TRASH_PREFIX;
+          HdfsFileStatus fileStatus = getFileInfo(trashPath, false, false,
+              false);
+          if (fileStatus == null) {
+            LOG.info("Trash doesn't exist for snapshottable directory {}. "
+                + "Creating trash at {}", currDir, trashPath);
+            PermissionStatus permissionStatus =
+                new PermissionStatus(getRemoteUser().getShortUserName(), null,
+                    SHARED_TRASH_PERMISSION);
+            mkdirs(trashPath, permissionStatus, false);
+          }
+        }
+      } catch (IOException e) {
+        final String msg =
+            "Could not provision Trash directory for existing "
+                + "snapshottable directories. Exiting Namenode.";
+        ExitUtil.terminate(1, msg);

Review comment:
   Pro: Terminating the NN in this case is a sure way of uncovering unexpected 
problems instead of hiding them in the logs.
   
   Con: I wonder if we really should terminate the NN when the Trash directory 
fails to be provisioned. We could just log a warning message?
   
   Either way, I'm fine with both. Just a thought.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
##
@@ -2524,7 +2524,7 @@ public void testNameNodeCreateSnapshotTrashRootOnStartup()
     MiniDFSCluster cluster =
         new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
     try {
-      final DistributedFileSystem dfs = cluster.getFileSystem();
+     DistributedFileSystem dfs = cluster.getFileSystem();

Review comment:
   nit: add one more space before this line for alignment.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 547784)
Time Spent: 20m  (was: 10m)

> Ensure snapshot root trash provisioning happens only post safe mode exit
> 
>
> Key: HDFS-15820
> URL: https://issues.apache.org/jira/browse/HDFS-15820
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h

[jira] [Commented] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread Siyao Meng (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279050#comment-17279050
 ] 

Siyao Meng commented on HDFS-15820:
---

[~shashikant] will do.

> Ensure snapshot root trash provisioning happens only post safe mode exit
> 
>
> Key: HDFS-15820
> URL: https://issues.apache.org/jira/browse/HDFS-15820
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts 
> along with the trash emptier service, but the namenode might not be out of 
> safe mode by then. This can fail the snapshot trash dir creation, thereby 
> crashing the namenode. The idea here is to trigger snapshot trash 
> provisioning only after safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring 
> NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create 
> directory /upgrade/.Trash. Name node is in safe mode.
> The reported blocks 0 needs additional 1383 blocks to reach the threshold 
> 0.9990 of total blocks 1385.
> The number of live datanodes 0 needs an additional 1 live datanodes to reach 
> the minimum number 1.
> Safe mode will be turned off automatically once the thresholds have been 
> reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:967)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:936)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740)
> 2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot 
> create directory /upgrade/.Trash. Name node is in safe mode.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-04 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278973#comment-17278973
 ] 

Jim Brennan commented on HDFS-15813:


Patch 004 fixes the checkstyle issue. I don't think the unit test failures are 
related; I was not able to reproduce them locally.


> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, 
> HDFS-15813.003.patch, HDFS-15813.004.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, the absence of a heartbeat during flush will be fixed in 
> a separate jira. It doesn't look like this change was ever pushed back to 
> Apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-04 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HDFS-15813:
---
Attachment: HDFS-15813.004.patch

> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, 
> HDFS-15813.003.patch, HDFS-15813.004.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, the absence of a heartbeat during flush will be fixed in 
> a separate jira. It doesn't look like this change was ever pushed back to 
> Apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread Shashikant Banerjee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278911#comment-17278911
 ] 

Shashikant Banerjee commented on HDFS-15820:


[~smeng], can you help review this?

> Ensure snapshot root trash provisioning happens only post safe mode exit
> 
>
> Key: HDFS-15820
> URL: https://issues.apache.org/jira/browse/HDFS-15820
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts 
> along with the trash emptier service, but the namenode might not be out of 
> safe mode by then. This can fail the snapshot trash dir creation, thereby 
> crashing the namenode. The idea here is to trigger snapshot trash 
> provisioning only post safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring 
> NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create 
> directory /upgrade/.Trash. Name node is in safe mode.
> The reported blocks 0 needs additional 1383 blocks to reach the threshold 
> 0.9990 of total blocks 1385.
> The number of live datanodes 0 needs an additional 1 live datanodes to reach 
> the minimum number 1.
> Safe mode will be turned off automatically once the thresholds have been 
> reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:967)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:936)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740)
> 2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot 
> create directory /upgrade/.Trash. Name node is in safe mode.
> {code}
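
To make the proposal concrete, a minimal sketch (the daemon-thread wiring and 
method visibility are assumptions; isInSafeMode() and 
checkAndProvisionSnapshotTrashRoots() appear in the trace above):

{code:java}
// Sketch only, not the actual patch: instead of provisioning trash roots
// inline in startActiveServices() (where it currently throws
// SafeModeException and kills the NN), hand it to a daemon thread that
// waits for safe mode to clear first.
void provisionTrashRootsAfterSafeMode(final FSNamesystem fsn) {
  Thread t = new Thread(() -> {
    try {
      while (fsn.isInSafeMode()) {
        Thread.sleep(1000);               // poll until safe mode exits
      }
      fsn.checkAndProvisionSnapshotTrashRoots();
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt(); // NameNode is shutting down
    }
  }, "SnapshotTrashRootProvisioner");
  t.setDaemon(true);
  t.start();
}
{code}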



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15817) Rename snapshots while marking them deleted

2021-02-04 Thread Shashikant Banerjee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278909#comment-17278909
 ] 

Shashikant Banerjee commented on HDFS-15817:


[~szetszwo], can you please help review this?

> Rename snapshots while marking them deleted 
> 
>
> Key: HDFS-15817
> URL: https://issues.apache.org/jira/browse/HDFS-15817
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> With the ordered snapshot feature turned on, a snapshot will only be marked 
> as deleted, and won't actually be deleted unless it is the oldest one. 
> Since the snapshot is merely marked deleted, creating a new snapshot with 
> the same name as the marked-deleted one will fail. To mitigate such 
> problems, the idea here is to rename a snapshot being marked deleted by 
> appending the deletion timestamp and the snapshot id to its name.
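
For illustration, a sketch of what the rename could look like (the separator 
and layout are assumptions, not the committed format):

{code:java}
// Sketch: build the tombstone name for a snapshot being marked deleted,
// e.g. "s1" -> "s1#deleted#1612429427323#42", freeing "s1" for reuse.
// The "#deleted#" separator is illustrative only.
static String markDeletedName(String snapshotName, long deletionTimeMs,
    int snapshotId) {
  return snapshotName + "#deleted#" + deletionTimeMs + "#" + snapshotId;
}
{code}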



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-15820:
---
Description: 
Currently, on namenode startup, snapshot trash root provisioning starts along 
with the trash emptier service, but the namenode might not be out of safe mode 
by then. This can fail the snapshot trash dir creation, thereby crashing the 
namenode. The idea here is to trigger snapshot trash provisioning only post 
safe mode exit.


{code:java}
2021-02-04 11:23:47,323 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: 
Error encountered requiring NN shutdown. Shutting down immediately.
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create 
directory /upgrade/.Trash. Name node is in safe mode.
The reported blocks 0 needs additional 1383 blocks to reach the threshold 
0.9990 of total blocks 1385.
The number of live datanodes 0 needs an additional 1 live datanodes to reach 
the minimum number 1.
Safe mode will be turned off automatically once the thresholds have been 
reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
at 
org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:967)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:936)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740)
2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot 
create directory /upgrade/.Trash. Name node is in safe mode.
{code}


  was:Currently, on namenode startup, snapshot trash root provisioning starts 
along with the trash emptier service, but the namenode might not be out of 
safe mode by then. This can fail the snapshot trash dir creation, thereby 
crashing the namenode. The idea here is to trigger snapshot trash provisioning 
only post safe mode exit.


> Ensure snapshot root trash provisioning happens only post safe mode exit
> 
>
> Key: HDFS-15820
> URL: https://issues.apache.org/jira/browse/HDFS-15820
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts 
> along with the trash emptier service, but the namenode might not be out of 
> safe mode by then. This can fail the snapshot trash dir creation, thereby 
> crashing the namenode. The idea here is to trigger snapshot trash 
> provisioning only post safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring 
> NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create 
> directory /upgrade/.Trash. Name node is in safe mode.
> The reported blocks 0 needs additional 1383 blocks to reach the threshold 
> 0.9990 of total blocks 1385.
> The number of live datanodes 0 needs an additional 1 live datanodes to reach 
> the minimum number 1.
> Safe mode will be turned off automatically once the thresholds have been 
> reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
> at 
> 

[jira] [Updated] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-15820:
--
Labels: pull-request-available  (was: )

> Ensure snapshot root trash provisioning happens only post safe mode exit
> 
>
> Key: HDFS-15820
> URL: https://issues.apache.org/jira/browse/HDFS-15820
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts 
> along with the trash emptier service, but the namenode might not be out of 
> safe mode by then. This can fail the snapshot trash dir creation, thereby 
> crashing the namenode. The idea here is to trigger snapshot trash 
> provisioning only post safe mode exit.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547686=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547686
 ]

ASF GitHub Bot logged work on HDFS-15820:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 15:24
Start Date: 04/Feb/21 15:24
Worklog Time Spent: 10m 
  Work Description: bshashikant opened a new pull request #2682:
URL: https://github.com/apache/hadoop/pull/2682


   Please see https://issues.apache.org/jira/browse/HDFS-15820.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 547686)
Remaining Estimate: 0h
Time Spent: 10m

> Ensure snapshot root trash provisioning happens only post safe mode exit
> 
>
> Key: HDFS-15820
> URL: https://issues.apache.org/jira/browse/HDFS-15820
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts 
> along with the trash emptier service, but the namenode might not be out of 
> safe mode by then. This can fail the snapshot trash dir creation, thereby 
> crashing the namenode. The idea here is to trigger snapshot trash 
> provisioning only post safe mode exit.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2021-02-04 Thread Shashikant Banerjee (Jira)
Shashikant Banerjee created HDFS-15820:
--

 Summary: Ensure snapshot root trash provisioning happens only post 
safe mode exit
 Key: HDFS-15820
 URL: https://issues.apache.org/jira/browse/HDFS-15820
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee


Currently, on namenode startup, snapshot trash root provisioning starts along 
with the trash emptier service, but the namenode might not be out of safe mode 
by then. This can fail the snapshot trash dir creation, thereby crashing the 
namenode. The idea here is to trigger snapshot trash provisioning only post 
safe mode exit.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist

2021-02-04 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278864#comment-17278864
 ] 

David Mollitor commented on HDFS-15790:
---

OK.  This looks good to me. 

As I said in my original issue, both engines were loaded into the same JVM 
and they would fight at the point of registration.  It looks like things are 
now set up so that both register in the same static way and no longer explode 
when they both register.  Thanks.
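
For readers following along, a sketch of what coexistence looks like from the 
caller's side (the two protocol interfaces are placeholders; 
RPC.setProtocolEngine records the engine per protocol in the Configuration, 
so the engines no longer contend over a single global registration):

{code:java}
// Placeholder protocol classes; each protocol picks its own engine, so
// ProtobufRpcEngine (protobuf 2.5) and ProtobufRpcEngine2 (shaded
// protobuf 3) can be loaded into the same JVM without fighting.
Configuration conf = new Configuration();
RPC.setProtocolEngine(conf, LegacyProtocolPB.class, ProtobufRpcEngine.class);
RPC.setProtocolEngine(conf, NewProtocolPB.class, ProtobufRpcEngine2.class);
{code}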

> Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
> --
>
> Key: HDFS-15790
> URL: https://issues.apache.org/jira/browse/HDFS-15790
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Changing from Protobuf 2 to Protobuf 3 broke some things in the Apache Hive 
> project.  This was not a great thing to do between minor versions with 
> regard to backwards compatibility for downstream projects.
> Additionally, these two frameworks are not drop-in replacements; they have 
> some differences.  Also, Protobuf 2 is not deprecated, so let us have both 
> protocols available at the same time.  In Hadoop 4.x, Protobuf 2 support 
> can be dropped.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-04 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278814#comment-17278814
 ] 

Renukaprasad C commented on HDFS-15792:
---

Thanks, [~hexiaoqiao]. Even I am not happy to write such code; it is a 
workaround I made so that it compiles on JDK 8. This syntax would not be 
needed on JDK 7.
From some of the logs generated above I noticed the environment uses Java 8. 
As of now this is used only in a test, so do we still need to focus on this 
checkstyle fix? Or do you have an alternative approach to address the compile 
issue?
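
For context on the underlying failure mode (a generic illustration, not the 
HDFS patch): HashMap is not thread-safe, and writers racing while a bucket is 
being treeified is what produces the Node-to-TreeNode ClassCastException 
quoted below. Serializing all writers, or switching to a concurrent map, 
removes the race:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SafeRefCounts {
  // A plain HashMap mutated by several threads can corrupt a bucket while
  // treeifying it, later surfacing as "HashMap$Node cannot be cast to
  // HashMap$TreeNode". ConcurrentHashMap (or one lock shared by every
  // put()/remove() caller) keeps the structure consistent.
  private final Map<String, Integer> refCounts = new ConcurrentHashMap<>();

  public void addReference(String feature) {
    refCounts.merge(feature, 1, Integer::sum);  // atomic read-modify-write
  }
}
{code}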

> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with a ClassCastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue - 
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
>   at 
> 

[jira] [Work logged] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15819?focusedWorklogId=547603=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547603
 ]

ASF GitHub Bot logged work on HDFS-15819:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 11:45
Start Date: 04/Feb/21 11:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2681:
URL: https://github.com/apache/hadoop/pull/2681#issuecomment-773247145


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  9s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 59s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 28s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   3m  6s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m  4s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 52s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  12m 36s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m  8s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 227m 44s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2681/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 45s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 313m  9s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2681/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2681 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux c6784034c641 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 15a1f7adfc0 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2681/1/testReport/ |
   | Max. process+thread count | 3062 (vs. ulimit of 5500) |
   | modules | C: 

[jira] [Work logged] (HDFS-15761) Dead NORMAL DN shouldn't transit to DECOMMISSIONED immediately

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15761?focusedWorklogId=547526=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547526
 ]

ASF GitHub Bot logged work on HDFS-15761:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 09:47
Start Date: 04/Feb/21 09:47
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on a change in pull request #2588:
URL: https://github.com/apache/hadoop/pull/2588#discussion_r570072022



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
##
@@ -383,30 +383,70 @@ public void testDecommissionStatusAfterDNRestart() throws 
Exception {
 
   /**
* Verify the support for decommissioning a datanode that is already dead.
-   * Under this scenario the datanode should immediately be marked as
-   * DECOMMISSIONED
+   * Under this scenario the datanode should be marked as
+   * DECOMMISSION_IN_PROGRESS first. When pendingReplicationBlocksCount and
+   * underReplicatedBlocksCount are both 0, it becomes DECOMMISSIONED.
*/
   @Test(timeout=12)
   public void testDecommissionDeadDN() throws Exception {
 Logger log = Logger.getLogger(DatanodeAdminManager.class);
 log.setLevel(Level.DEBUG);
-DatanodeID dnID = cluster.getDataNodes().get(0).getDatanodeId();
-String dnName = dnID.getXferAddr();
-DataNodeProperties stoppedDN = cluster.stopDataNode(0);
-DFSTestUtil.waitForDatanodeState(cluster, dnID.getDatanodeUuid(),
-false, 3);
+
+DistributedFileSystem fileSystem = cluster.getFileSystem();
+
+// Create a file with one block. That block has one replica.
+Path f = new Path("decommission.dat");
+DFSTestUtil.createFile(fileSystem, f, fileSize, fileSize, fileSize,
+(short)1, seed);
+
+// Find the DN that owns the only replica.
+RemoteIterator<LocatedFileStatus> fileList =
+fileSystem.listLocatedStatus(f);
+BlockLocation[] blockLocations = fileList.next().getBlockLocations();
+String[] dnNames = blockLocations[0].getNames();

Review comment:
   As the target DN is a single host, we may not need the String array and 
for-loop.
   ```java
   String dnName = blockLocations[0].getNames()[0];
   ```

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
##
@@ -383,30 +383,70 @@ public void testDecommissionStatusAfterDNRestart() throws 
Exception {
 
   /**
* Verify the support for decommissioning a datanode that is already dead.
-   * Under this scenario the datanode should immediately be marked as
-   * DECOMMISSIONED
+   * Under this scenario the datanode should be marked as
+   * DECOMMISSION_IN_PROGRESS first. When pendingReplicationBlocksCount and
+   * underReplicatedBlocksCount are both 0, it becomes DECOMMISSIONED.
*/
   @Test(timeout=12)
   public void testDecommissionDeadDN() throws Exception {
 Logger log = Logger.getLogger(DatanodeAdminManager.class);
 log.setLevel(Level.DEBUG);
-DatanodeID dnID = cluster.getDataNodes().get(0).getDatanodeId();
-String dnName = dnID.getXferAddr();
-DataNodeProperties stoppedDN = cluster.stopDataNode(0);
-DFSTestUtil.waitForDatanodeState(cluster, dnID.getDatanodeUuid(),
-false, 3);
+
+DistributedFileSystem fileSystem = cluster.getFileSystem();
+
+// Create a file with one block. That block has one replica.
+Path f = new Path("decommission.dat");
+DFSTestUtil.createFile(fileSystem, f, fileSize, fileSize, fileSize,
+(short)1, seed);
+
+// Find the DN that owns the only replica.
+RemoteIterator<LocatedFileStatus> fileList =
+fileSystem.listLocatedStatus(f);
+BlockLocation[] blockLocations = fileList.next().getBlockLocations();
+String[] dnNames = blockLocations[0].getNames();
+
+// Stop the DN leads to 1 block under-replicated
+DataNodeProperties[] stoppedDNs = new DataNodeProperties[dnNames.length];
+for (int i = 0; i < dnNames.length; i++) {
+  stoppedDNs[i] = cluster.stopDataNode(dnNames[i]);
+}
+
 FSNamesystem fsn = cluster.getNamesystem();
 final DatanodeManager dm = fsn.getBlockManager().getDatanodeManager();
-DatanodeDescriptor dnDescriptor = dm.getDatanode(dnID);
-decommissionNode(dnName);
+final List<DatanodeDescriptor> dead = new ArrayList<>();
+while (true) {
+  dm.fetchDatanodes(null, dead, false);
+  if (dead.size() == 3) {

Review comment:
   Why wait for `dead.size() == 3`? They all seem to be the same host.
   
   And it would be better to use `GenericTestUtils.waitFor` instead of a 
`while(true)` loop.
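   
   One possible shape (a sketch; the expected count should come from however 
many distinct DNs the test actually stops):
   ```java
   // Bounded wait instead of while(true): check every 500 ms, time out
   // after 30 s and fail the test with a diagnostic.
   GenericTestUtils.waitFor(() -> {
     List<DatanodeDescriptor> deadNodes = new ArrayList<>();
     dm.fetchDatanodes(null, deadNodes, false);
     return deadNodes.size() == stoppedDNs.length;
   }, 500, 30000);
   ```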

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
##
@@ -453,10 +493,10 @@ public void testDecommissionLosingData() throws Exception 
{
 

[jira] [Commented] (HDFS-15812) after deleting data of hbase table hdfs size is not decreasing

2021-02-04 Thread Satya Gaurav (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278715#comment-17278715
 ] 

Satya Gaurav commented on HDFS-15812:
-

The value of hbase.master.hfilecleaner.ttl = 360 ms, i.e. 1 hr.

> after deleting data of hbase table hdfs size is not decreasing
> --
>
> Key: HDFS-15812
> URL: https://issues.apache.org/jira/browse/HDFS-15812
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.0.2-alpha
> Environment: HDP 3.1.4.0-315
> Hbase 2.0.2.3.1.4.0-315
>Reporter: Satya Gaurav
>Priority: Major
>
> I am deleting data from an HBase table; the rows are deleted from the 
> table, but the size of the HDFS directory is not reducing. I even ran a 
> major compaction, but the HDFS size still did not reduce. Any solution for 
> this issue?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15818) Fix TestFsDatasetImpl.testReadLockCanBeDisabledByConfig

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15818?focusedWorklogId=547500=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547500
 ]

ASF GitHub Bot logged work on HDFS-15818:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 09:05
Start Date: 04/Feb/21 09:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2679:
URL: https://github.com/apache/hadoop/pull/2679#issuecomment-773148937


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 33s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 12s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 59s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m  5s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 19s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 15s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 56s | 
[/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2679/1/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 41 unchanged - 
0 fixed = 43 total (was 41)  |
   | +1 :green_heart: |  mvnsite  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m 58s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m  7s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 195m 42s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2679/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 284m 43s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2679/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2679 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 4af8bbd0fc69 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 15a1f7adfc0 |
   | Default Java | Private 

[jira] [Commented] (HDFS-15757) RBF: Improving Router Connection Management

2021-02-04 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278649#comment-17278649
 ] 

Xiaoqiao He commented on HDFS-15757:


Thanks [~fengnanli] for your work and the update. The new patch looks good to 
me.
One concern: could it add extra end-to-end latency? In the original 
implementation there is a chance to multiplex a connection for the same 
request (maybe a low probability) because recycling is postponed. After this 
change the reuse probability will be reduced, IMO, and setting up a new 
connection is a more costly operation than reusing one. So, do you observe 
any end-to-end latency cost?
Please correct me if something is wrong. Thanks.

> RBF: Improving Router Connection Management
> ---
>
> Key: HDFS-15757
> URL: https://issues.apache.org/jira/browse/HDFS-15757
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: RBF_ Improving Router Connection Management_v2.pdf, RBF_ 
> Improving Router Connection Management_v3.pdf, RBF_ Router Connection 
> Management.pdf
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We have seen high number of connections from Router to namenodes, leaving 
> namenodes unstable.
> This ticket is trying to reduce connections through some changes. Please take 
> a look at the design and leave comments. 
> Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-04 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278633#comment-17278633
 ] 

Xiaoqiao He commented on HDFS-15792:


[~prasad-acit], Yetus reported a checkstyle issue; would you mind fixing it?
BTW, IMO the type conversion is not necessary here. FYI. Thanks.
{code:java}
   public ImmutableList<E> getEntries() {
-    return new ImmutableList.Builder<E>()
-        .addAll(referenceMap.keySet()).build();
+    return new ImmutableList.Builder<E>()
+        .addAll((new java.util.HashMap<>(referenceMap)).keySet()).build();
   }
{code}
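
If a defensive copy does turn out to be necessary, a more direct shape might 
be Guava's ImmutableList.copyOf (a suggestion; not verified against the JDK 8 
inference problem discussed above):

{code:java}
public ImmutableList<E> getEntries() {
  // copyOf snapshots the key set in one call, with no Builder and no
  // intermediate HashMap.
  return ImmutableList.copyOf(referenceMap.keySet());
}
{code}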


> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with a ClassCastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue - 
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
>   at 
>