[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649828=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649828
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 13/Sep/21 05:36
Start Date: 13/Sep/21 05:36
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on a change in pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#discussion_r707017464



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
##
@@ -1050,8 +1050,10 @@ static File moveBlockFiles(Block b, ReplicaInfo 
replicaInfo, File destdir)
   File dstFile)
   throws IOException {
 // Create parent folder if not exists.
-srcReplica.getFileIoProvider()
+boolean isDirCreated = srcReplica.getFileIoProvider()
 .mkdirs(srcReplica.getVolume(), dstFile.getParentFile());
+LOG.trace("Dir creation of {} on volume {} {}", dstFile.getParentFile(),

Review comment:
   File.getParentFile() allocates and returns a new string. It's not the 
kind of parameter you want to log only at TRACE level.
   
   I doubt it'd have a meaningful performance penalty though because the actual 
file system hard link probably spend much more time.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649828)
Time Spent: 6h 10m  (was: 6h)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE![ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16198) Short circuit read leaks Slot objects when InvalidToken exception is thrown

2021-09-12 Thread Eungsop Yoo (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eungsop Yoo updated HDFS-16198:
---
Description: 
In secure mode, 'dfs.block.access.token.enable' should be set 'true'. With this 
configuration SecretManager.InvalidToken exception may be thrown if the access 
token expires when we do short circuit reads. It doesn't matter because the 
failed reads will be retried. But it causes the leakage of ShortCircuitShm.Slot 
objects. 

 

We found this problem in our secure HBase clusters. The number of open file 
descriptors of RegionServers kept increasing using short circuit reading. 

!screenshot-2.png!

 

It was caused by the leakage of shared memory segments used by short circuit 
reading.
{code:java}
[root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk 
'{print $2}') | grep /dev/shm | wc -l
3925
[root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk 
'{print $2}') | grep /dev/shm | head -5
java 86309 hbase DEL REG 0,19 2308279984 
/dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_743473959
java 86309 hbase DEL REG 0,19 2306359893 
/dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_1594162967
java 86309 hbase DEL REG 0,19 2305496758 
/dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_2043027439
java 86309 hbase DEL REG 0,19 2304784261 
/dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_689571088
java 86309 hbase DEL REG 0,19 2302621988 
/dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_347008590 
{code}
 

We finally found that the root cause of this is the leakage of 
ShortCircuitShm.Slot.

 

The fix is trivial. Just free the slot when InvalidToken exception is thrown.

  was:
In secure mode, 'dfs.block.access.token.enable' should be set 'true'. With this 
configuration SecretManager.InvalidToken exception may be thrown if the access 
token expires when we do short circuit reads. It doesn't matter because the 
failed reads will be retried. But it causes the leakage of ShortCircuitShm.Slot 
objects. We found this problem in our secure HBase clusters.
 !screenshot-2.png! 

The fix is trivial. Just free the slot when InvalidToken exception is thrown.


> Short circuit read leaks Slot objects when InvalidToken exception is thrown
> ---
>
> Key: HDFS-16198
> URL: https://issues.apache.org/jira/browse/HDFS-16198
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Eungsop Yoo
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16198.patch, screenshot-2.png
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In secure mode, 'dfs.block.access.token.enable' should be set 'true'. With 
> this configuration SecretManager.InvalidToken exception may be thrown if the 
> access token expires when we do short circuit reads. It doesn't matter 
> because the failed reads will be retried. But it causes the leakage of 
> ShortCircuitShm.Slot objects. 
>  
> We found this problem in our secure HBase clusters. The number of open file 
> descriptors of RegionServers kept increasing using short circuit reading. 
> !screenshot-2.png!
>  
> It was caused by the leakage of shared memory segments used by short circuit 
> reading.
> {code:java}
> [root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk 
> '{print $2}') | grep /dev/shm | wc -l
> 3925
> [root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk 
> '{print $2}') | grep /dev/shm | head -5
> java 86309 hbase DEL REG 0,19 2308279984 
> /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_743473959
> java 86309 hbase DEL REG 0,19 2306359893 
> /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_1594162967
> java 86309 hbase DEL REG 0,19 2305496758 
> /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_2043027439
> java 86309 hbase DEL REG 0,19 2304784261 
> /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_689571088
> java 86309 hbase DEL REG 0,19 2302621988 
> /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_347008590 
> {code}
>  
> We finally found that the root cause of this is the leakage of 
> ShortCircuitShm.Slot.
>  
> The fix is trivial. Just free the slot when InvalidToken exception is thrown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

2021-09-12 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee resolved HDFS-16187.

Fix Version/s: 1.3.0
   Resolution: Fixed

> SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN 
> restarts with checkpointing
> ---
>
> Key: HDFS-16187
> URL: https://issues.apache.org/jira/browse/HDFS-16187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Srinivasu Majeti
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.3.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The below test shows the snapshot diff between across snapshots is not 
> consistent with Xattr(EZ here settinh the Xattr) across NN restarts with 
> checkpointed FsImage.
> {code:java}
> @Test
> public void testEncryptionZonesWithSnapshots() throws Exception {
>   final Path snapshottable = new Path("/zones");
>   fsWrapper.mkdir(snapshottable, FsPermission.getDirDefault(),
>   true);
>   dfsAdmin.allowSnapshot(snapshottable);
>   dfsAdmin.createEncryptionZone(snapshottable, TEST_KEY, NO_TRASH);
>   fs.createSnapshot(snapshottable, "snap1");
>   SnapshotDiffReport report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   System.out.println(report);
>   Assert.assertEquals(0, report.getDiffList().size());
>   fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
>   fs.saveNamespace();
>   fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
>   cluster.restartNameNode(true);
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
> }{code}
> {code:java}
> Pre Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> Post Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> M .{code}
> The side effect of this behavior is : distcp with snapshot diff would fail 
> with below error complaining that target cluster has some data changed .
> {code:java}
> WARN tools.DistCp: The target has been modified since snapshot x
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16187?focusedWorklogId=649822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649822
 ]

ASF GitHub Bot logged work on HDFS-16187:
-

Author: ASF GitHub Bot
Created on: 13/Sep/21 04:55
Start Date: 13/Sep/21 04:55
Worklog Time Spent: 10m 
  Work Description: bshashikant merged pull request #3340:
URL: https://github.com/apache/hadoop/pull/3340


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649822)
Time Spent: 3h 50m  (was: 3h 40m)

> SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN 
> restarts with checkpointing
> ---
>
> Key: HDFS-16187
> URL: https://issues.apache.org/jira/browse/HDFS-16187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Srinivasu Majeti
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The below test shows the snapshot diff between across snapshots is not 
> consistent with Xattr(EZ here settinh the Xattr) across NN restarts with 
> checkpointed FsImage.
> {code:java}
> @Test
> public void testEncryptionZonesWithSnapshots() throws Exception {
>   final Path snapshottable = new Path("/zones");
>   fsWrapper.mkdir(snapshottable, FsPermission.getDirDefault(),
>   true);
>   dfsAdmin.allowSnapshot(snapshottable);
>   dfsAdmin.createEncryptionZone(snapshottable, TEST_KEY, NO_TRASH);
>   fs.createSnapshot(snapshottable, "snap1");
>   SnapshotDiffReport report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   System.out.println(report);
>   Assert.assertEquals(0, report.getDiffList().size());
>   fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
>   fs.saveNamespace();
>   fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
>   cluster.restartNameNode(true);
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
> }{code}
> {code:java}
> Pre Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> Post Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> M .{code}
> The side effect of this behavior is : distcp with snapshot diff would fail 
> with below error complaining that target cluster has some data changed .
> {code:java}
> WARN tools.DistCp: The target has been modified since snapshot x
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16187?focusedWorklogId=649823=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649823
 ]

ASF GitHub Bot logged work on HDFS-16187:
-

Author: ASF GitHub Bot
Created on: 13/Sep/21 04:55
Start Date: 13/Sep/21 04:55
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #3340:
URL: https://github.com/apache/hadoop/pull/3340#issuecomment-917840218


   Thanks @cnauroth and @szetszwo for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649823)
Time Spent: 4h  (was: 3h 50m)

> SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN 
> restarts with checkpointing
> ---
>
> Key: HDFS-16187
> URL: https://issues.apache.org/jira/browse/HDFS-16187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Srinivasu Majeti
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The below test shows the snapshot diff between across snapshots is not 
> consistent with Xattr(EZ here settinh the Xattr) across NN restarts with 
> checkpointed FsImage.
> {code:java}
> @Test
> public void testEncryptionZonesWithSnapshots() throws Exception {
>   final Path snapshottable = new Path("/zones");
>   fsWrapper.mkdir(snapshottable, FsPermission.getDirDefault(),
>   true);
>   dfsAdmin.allowSnapshot(snapshottable);
>   dfsAdmin.createEncryptionZone(snapshottable, TEST_KEY, NO_TRASH);
>   fs.createSnapshot(snapshottable, "snap1");
>   SnapshotDiffReport report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   System.out.println(report);
>   Assert.assertEquals(0, report.getDiffList().size());
>   fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
>   fs.saveNamespace();
>   fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
>   cluster.restartNameNode(true);
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
> }{code}
> {code:java}
> Pre Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> Post Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> M .{code}
> The side effect of this behavior is : distcp with snapshot diff would fail 
> with below error complaining that target cluster has some data changed .
> {code:java}
> WARN tools.DistCp: The target has been modified since snapshot x
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16187?focusedWorklogId=649821=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649821
 ]

ASF GitHub Bot logged work on HDFS-16187:
-

Author: ASF GitHub Bot
Created on: 13/Sep/21 04:53
Start Date: 13/Sep/21 04:53
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #3340:
URL: https://github.com/apache/hadoop/pull/3340#issuecomment-917839405


   Thanks @szetszwo . I have retriggered the CI and the unit test failures in 
the report are not related.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649821)
Time Spent: 3h 40m  (was: 3.5h)

> SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN 
> restarts with checkpointing
> ---
>
> Key: HDFS-16187
> URL: https://issues.apache.org/jira/browse/HDFS-16187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Srinivasu Majeti
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The below test shows the snapshot diff between across snapshots is not 
> consistent with Xattr(EZ here settinh the Xattr) across NN restarts with 
> checkpointed FsImage.
> {code:java}
> @Test
> public void testEncryptionZonesWithSnapshots() throws Exception {
>   final Path snapshottable = new Path("/zones");
>   fsWrapper.mkdir(snapshottable, FsPermission.getDirDefault(),
>   true);
>   dfsAdmin.allowSnapshot(snapshottable);
>   dfsAdmin.createEncryptionZone(snapshottable, TEST_KEY, NO_TRASH);
>   fs.createSnapshot(snapshottable, "snap1");
>   SnapshotDiffReport report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   System.out.println(report);
>   Assert.assertEquals(0, report.getDiffList().size());
>   fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
>   fs.saveNamespace();
>   fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
>   cluster.restartNameNode(true);
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
> }{code}
> {code:java}
> Pre Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> Post Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> M .{code}
> The side effect of this behavior is : distcp with snapshot diff would fail 
> with below error complaining that target cluster has some data changed .
> {code:java}
> WARN tools.DistCp: The target has been modified since snapshot x
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16222) Fix ViewDFS with mount points for HDFS only API

2021-09-12 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-16222:

Description: Presently, For HDFS specific API, The ones not present in 
ViewFileSystem. The resolved path seems to be coming wrong.  (was: Presently, 
For HDFS specific API, The resolved path seems to be coming wrong.)

> Fix ViewDFS with mount points for HDFS only API
> ---
>
> Key: HDFS-16222
> URL: https://issues.apache.org/jira/browse/HDFS-16222
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Attachments: test_to_repro.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Presently, For HDFS specific API, The ones not present in ViewFileSystem. The 
> resolved path seems to be coming wrong.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16221) RBF: Add usage of refreshCallQueue for Router

2021-09-12 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-16221.

Resolution: Fixed

> RBF: Add usage of refreshCallQueue for Router
> -
>
> Key: HDFS-16221
> URL: https://issues.apache.org/jira/browse/HDFS-16221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In HDFS-16210, the feature of refreshCallQueue is added to RouterAdmin, but 
> the usageInfo is not added. So when user is typing "hdfs dfsrouteraadmin", 
> the option of "refreshCallQueue" will not be shown.
> This ticket is to add the usage information for refreshCallQueue for Router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16221) RBF: Add usage of refreshCallQueue for Router

2021-09-12 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-16221:
---
Fix Version/s: 3.3.2

> RBF: Add usage of refreshCallQueue for Router
> -
>
> Key: HDFS-16221
> URL: https://issues.apache.org/jira/browse/HDFS-16221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In HDFS-16210, the feature of refreshCallQueue is added to RouterAdmin, but 
> the usageInfo is not added. So when user is typing "hdfs dfsrouteraadmin", 
> the option of "refreshCallQueue" will not be shown.
> This ticket is to add the usage information for refreshCallQueue for Router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-16221) RBF: Add usage of refreshCallQueue for Router

2021-09-12 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei reopened HDFS-16221:


> RBF: Add usage of refreshCallQueue for Router
> -
>
> Key: HDFS-16221
> URL: https://issues.apache.org/jira/browse/HDFS-16221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In HDFS-16210, the feature of refreshCallQueue is added to RouterAdmin, but 
> the usageInfo is not added. So when user is typing "hdfs dfsrouteraadmin", 
> the option of "refreshCallQueue" will not be shown.
> This ticket is to add the usage information for refreshCallQueue for Router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16220) [FGL]Configurable INodeMap#NAMESPACE_KEY_DEPTH_RANGES_STATIC

2021-09-12 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413853#comment-17413853
 ] 

JiangHua Zhu commented on HDFS-16220:
-

Thanks [~prasad-acit] for the comment.


> [FGL]Configurable INodeMap#NAMESPACE_KEY_DEPTH_RANGES_STATIC
> 
>
> Key: HDFS-16220
> URL: https://issues.apache.org/jira/browse/HDFS-16220
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In INodeMap, NAMESPACE_KEY_DEPTH and NUM_RANGES_STATIC are a fixed value, we 
> should make it configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649797=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649797
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 23:42
Start Date: 12/Sep/21 23:42
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-917734240


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 19s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  2s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 12s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 33s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 240 unchanged - 3 
fixed = 240 total (was 243)  |
   | +1 :green_heart: |  mvnsite  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  9s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 22s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 229m 10s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 315m 32s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/12/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3386 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux f91c3af3b149 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 8e1eab0b1cd4dddbddc9cea02a657c87bab2b9bb |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/12/testReport/ |
   | Max. process+thread count | 3406 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/12/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
 

[jira] [Work logged] (HDFS-16203) Discover datanodes with unbalanced block pool usage by the standard deviation

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16203?focusedWorklogId=649738=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649738
 ]

ASF GitHub Bot logged work on HDFS-16203:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 20:20
Start Date: 12/Sep/21 20:20
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3366:
URL: https://github.com/apache/hadoop/pull/3366#issuecomment-917703816


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  jshint  |   0m  1s |  |  jshint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 44s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   4m 58s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   4m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  7s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  7s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 46s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 35s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 49s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   4m 49s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   4m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  hadoop-hdfs-project: The 
patch generated 0 new + 117 unchanged - 9 fixed = 117 total (was 126)  |
   | +1 :green_heart: |  mvnsite  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   2m 43s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   7m  1s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m 38s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 23s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 234m 54s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3366/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  unit  |  20m 48s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3366/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 53s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 378m 10s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   |   | hadoop.hdfs.server.federation.router.TestRouterRPCClientRetries |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3366/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3366 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell 

[jira] [Commented] (HDFS-16220) [FGL]Configurable INodeMap#NAMESPACE_KEY_DEPTH_RANGES_STATIC

2021-09-12 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413768#comment-17413768
 ] 

Renukaprasad C commented on HDFS-16220:
---

Thanks [~jianghuazhu] for reporting the issue and the patch.

Configuration file has issue, which results many test failures. You may correct 
it, should be able to get rid of these unwanted results.

Also, there are some static issues reported please take a look.

[~shv]  [~xinglin] when you feel free, can you please take a look at the PR? 

> [FGL]Configurable INodeMap#NAMESPACE_KEY_DEPTH_RANGES_STATIC
> 
>
> Key: HDFS-16220
> URL: https://issues.apache.org/jira/browse/HDFS-16220
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In INodeMap, NAMESPACE_KEY_DEPTH and NUM_RANGES_STATIC are a fixed value, we 
> should make it configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649723=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649723
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 18:16
Start Date: 12/Sep/21 18:16
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on a change in pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#discussion_r706874063



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FileIoProvider.java
##
@@ -513,6 +513,7 @@ public boolean fullyDelete(@Nullable FsVolumeSpi volume, 
File dir) {
 try {
   faultInjectorEventHook.beforeMetadataOp(volume, DELETE);
   boolean deleted = FileUtil.fullyDelete(dir);
+  LOG.trace("Deletion of dir {} {}", dir, deleted ? "succeeded" : 
"failed");

Review comment:
   While debugging this issue, I felt that a log line here would make it 
easier for operator to confirm the output of Delete operation, and trace log 
level is sufficient on the other hand, hence I thought of keeping this log line.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649723)
Time Spent: 5h 50m  (was: 5h 40m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE![ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649722=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649722
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 18:13
Start Date: 12/Sep/21 18:13
Worklog Time Spent: 10m 
  Work Description: LeonGao91 commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-917684061


   thanks @virajjasani , looks good to me. Just leaving few nit comments.
   
   @ferhui @ayushtkn Could you please take a second look.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649722)
Time Spent: 5h 40m  (was: 5.5h)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE![ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649721=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649721
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 18:09
Start Date: 12/Sep/21 18:09
Worklog Time Spent: 10m 
  Work Description: LeonGao91 commented on a change in pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#discussion_r706873351



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
##
@@ -1307,35 +1310,42 @@ public void testDnRestartWithHardLinkInTmp() {
* DiskScanner should clean up the hardlink correctly.
*/
   @Test(timeout = 3)
-  public void testDnRestartWithHardLink() {
+  public void testDnRestartWithHardLink() throws Exception {
 MiniDFSCluster cluster = null;
 try {
   conf.setBoolean(DFSConfigKeys
   .DFS_DATANODE_ALLOW_SAME_DISK_TIERING, true);
   conf.setDouble(DFSConfigKeys
   .DFS_DATANODE_RESERVE_FOR_ARCHIVE_DEFAULT_PERCENTAGE, 0.5);
+  conf.setBoolean(DFSConfigKeys.DFS_DATANODE_DUPLICATE_REPLICA_DELETION,

Review comment:
   Maybe add 1 line on why we added this? Something like:
   
   "This is to guarantee the corner case that datanode restart in the middle of 
the block movement may leave uncleaned hardlink."




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649721)
Time Spent: 5.5h  (was: 5h 20m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE![ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649720=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649720
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 18:04
Start Date: 12/Sep/21 18:04
Worklog Time Spent: 10m 
  Work Description: LeonGao91 commented on a change in pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#discussion_r706872610



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FileIoProvider.java
##
@@ -513,6 +513,7 @@ public boolean fullyDelete(@Nullable FsVolumeSpi volume, 
File dir) {
 try {
   faultInjectorEventHook.beforeMetadataOp(volume, DELETE);
   boolean deleted = FileUtil.fullyDelete(dir);
+  LOG.trace("Deletion of dir {} {}", dir, deleted ? "succeeded" : 
"failed");

Review comment:
   Do we need to add the trace logs as it is unrelated to the fix?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649720)
Time Spent: 5h 20m  (was: 5h 10m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE![ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649719=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649719
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 17:17
Start Date: 12/Sep/21 17:17
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-917675592


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 59s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  18m 52s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 240 unchanged - 3 
fixed = 240 total (was 243)  |
   | +1 :green_heart: |  mvnsite  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m  2s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 350m 48s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 39s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 443m  7s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/11/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3386 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 9757ee312e51 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 
19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cf7dc937fb00691f74ca7a54292735d87dd8b7af |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/11/testReport/ |
   | Max. process+thread count | 2140 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/11/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649716=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649716
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 15:06
Start Date: 12/Sep/21 15:06
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-917653913


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 28s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  5s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 23s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 240 unchanged - 3 
fixed = 240 total (was 243)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  7s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  15m 59s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 231m 33s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 316m  1s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/9/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3386 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 935d6934fa36 4.15.0-151-generic #157-Ubuntu SMP Fri Jul 9 
23:07:57 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cf7dc937fb00691f74ca7a54292735d87dd8b7af |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/9/testReport/ |
   | Max. process+thread count | 2938 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/9/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649715=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649715
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 15:02
Start Date: 12/Sep/21 15:02
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-917653119


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 56s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  3s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 10s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 27s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 52s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 240 unchanged - 3 
fixed = 240 total (was 243)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 14s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 231m  9s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 316m  3s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/8/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3386 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux eb4ed62f2b49 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 31cd896568bcd3f95e3cd6741dc2d4563ea8ab85 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/8/testReport/ |
   | Max. process+thread count | 3081 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/8/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   

[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16187?focusedWorklogId=649711=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649711
 ]

ASF GitHub Bot logged work on HDFS-16187:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 14:48
Start Date: 12/Sep/21 14:48
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3340:
URL: https://github.com/apache/hadoop/pull/3340#issuecomment-917650593


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 58s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 55s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 22s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 184 unchanged - 1 
fixed = 184 total (was 185)  |
   | +1 :green_heart: |  mvnsite  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  18m 50s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 316m 23s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3340/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 39s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 409m 35s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.mover.TestMover |
   |   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3340/8/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3340 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux d0f51dac6e97 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 
19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 19ac0d5ae3b7a7f8fb16abe3265a2428fb8f9335 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 

[jira] [Commented] (HDFS-16195) Fix log message when choosing storage groups for block movement in balancer

2021-09-12 Thread Preeti (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413719#comment-17413719
 ] 

Preeti commented on HDFS-16195:
---

[~prasad-acit] [~vjasani] thank you. Sorry it took me a while to get back. Are 
you suggesting that the patch is ready to be merged or there is more I need to 
correct?

 

[~hemanthboyina] what do you mean by raise a PR? Against which repository? I 
thought all reviews are done through this JIRA.

> Fix log message when choosing storage groups for block movement in balancer
> ---
>
> Key: HDFS-16195
> URL: https://issues.apache.org/jira/browse/HDFS-16195
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover
>Reporter: Preeti
>Assignee: Preeti
>Priority: Major
> Attachments: HADOOP-16195.001.patch, HADOOP-16195.002.patch, 
> HADOOP-16195.003.patch, hadoop-format.xml
>
>
> Correct the log message in line with the logic associated with
> moving blocks in chooseStorageGroups() in the balancer. All log lines should 
> indicate from which storage source the blocks are being moved correctly to 
> avoid ambiguity. Right now one of the log lines is incorrect: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java#L555]
>  which indicates that storage blocks are moved from underUtilized to 
> aboveAvgUtilized nodes, while it is actually the other way around in the code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16203) Discover datanodes with unbalanced block pool usage by the standard deviation

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16203?focusedWorklogId=649706=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649706
 ]

ASF GitHub Bot logged work on HDFS-16203:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 13:48
Start Date: 12/Sep/21 13:48
Worklog Time Spent: 10m 
  Work Description: tomscut commented on a change in pull request #3366:
URL: https://github.com/apache/hadoop/pull/3366#discussion_r706840884



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##
@@ -6537,14 +6537,45 @@ public String getLiveNodes() {
   if (node.getUpgradeDomain() != null) {
 innerinfo.put("upgradeDomain", node.getUpgradeDomain());
   }
+  StorageReport[] storageReports = node.getStorageReports();
+  innerinfo.put("blockPoolUsedPercentStdDev",
+  getBlockPoolUsedPercentStdDev(storageReports));
   info.put(node.getXferAddrWithHostname(), innerinfo.build());
 }
 return JSON.toString(info);
   }
 
+  /**
+   * Return the standard deviation of storage block pool usage.
+   */
+  @VisibleForTesting
+  public float getBlockPoolUsedPercentStdDev(StorageReport[] storageReports) {
+ArrayList usagePercentList = new ArrayList<>();
+float totalUsagePercent = 0.0f;
+float dev = 0.0f;
+
+if (storageReports.length == 0) {
+  return dev;
+}
+
+for (StorageReport s : storageReports) {
+  usagePercentList.add(s.getBlockPoolUsagePercent());
+  totalUsagePercent += s.getBlockPoolUsagePercent();
+}
+
+totalUsagePercent /= storageReports.length;
+Collections.sort(usagePercentList);

Review comment:
   @ferhui A float or double may lose precision when being evaluated. When 
multiple values are operated on in different order, the results may be 
inconsistent. After remote ```Collections.sort(usagePercentList);```, I only 
take two decimal points to assert. Please take a look at this. Thank you very 
much.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649706)
Time Spent: 3h  (was: 2h 50m)

> Discover datanodes with unbalanced block pool usage by the standard deviation
> -
>
> Key: HDFS-16203
> URL: https://issues.apache.org/jira/browse/HDFS-16203
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-09-01-19-16-27-172.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> *Discover datanodes with unbalanced volume usage by the standard deviation.*
> *In some scenarios, we may cause unbalanced datanode disk usage:*
>  1. Repair the damaged disk and make it online again.
>  2. Add disks to some Datanodes.
>  3. Some disks are damaged, resulting in slow data writing.
>  4. Use some custom volume choosing policies.
> In the case of unbalanced disk usage, a sudden increase in datanode write 
> traffic may result in busy disk I/O with low volume usage, resulting in 
> decreased throughput across datanodes.
> We need to find these nodes in time to do diskBalance, or other processing. 
> Based on the volume usage of each datanode, we can calculate the standard 
> deviation of the volume usage. The more unbalanced the volume, the higher the 
> standard deviation.
> *We can display the result on the Web of namenode, and then sorting directly 
> to find the nodes where the volumes usages are unbalanced.*
> *{color:#172b4d}This interface is only used to obtain metrics and does not 
> adversely affect namenode performance.{color}*
>  
> {color:#172b4d}!image-2021-09-01-19-16-27-172.png|width=581,height=216!{color}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16203) Discover datanodes with unbalanced block pool usage by the standard deviation

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16203?focusedWorklogId=649705=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649705
 ]

ASF GitHub Bot logged work on HDFS-16203:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 13:46
Start Date: 12/Sep/21 13:46
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3366:
URL: https://github.com/apache/hadoop/pull/3366#issuecomment-917639519


   > Overall it looks good.
   > DataNode Info shows on namenode UI and Router UI, It is reasonable to add 
this into router UI.
   > Maybe it is like namenode, just lines code changed. We can review it here.
   
   Hi @ferhui , I added this into router UI and did some tests. 
   
   There are two NS in our test cluster. NS1 supports this feature, and NS2 
does not. In this case, the Router UI is shown as follows:
   
![router](https://user-images.githubusercontent.com/55134131/132989998-98b08889-bea0-4066-8f66-3b07cbe7cb3a.jpg)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649705)
Time Spent: 2h 50m  (was: 2h 40m)

> Discover datanodes with unbalanced block pool usage by the standard deviation
> -
>
> Key: HDFS-16203
> URL: https://issues.apache.org/jira/browse/HDFS-16203
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-09-01-19-16-27-172.png
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> *Discover datanodes with unbalanced volume usage by the standard deviation.*
> *In some scenarios, we may cause unbalanced datanode disk usage:*
>  1. Repair the damaged disk and make it online again.
>  2. Add disks to some Datanodes.
>  3. Some disks are damaged, resulting in slow data writing.
>  4. Use some custom volume choosing policies.
> In the case of unbalanced disk usage, a sudden increase in datanode write 
> traffic may result in busy disk I/O with low volume usage, resulting in 
> decreased throughput across datanodes.
> We need to find these nodes in time to do diskBalance, or other processing. 
> Based on the volume usage of each datanode, we can calculate the standard 
> deviation of the volume usage. The more unbalanced the volume, the higher the 
> standard deviation.
> *We can display the result on the Web of namenode, and then sorting directly 
> to find the nodes where the volumes usages are unbalanced.*
> *{color:#172b4d}This interface is only used to obtain metrics and does not 
> adversely affect namenode performance.{color}*
>  
> {color:#172b4d}!image-2021-09-01-19-16-27-172.png|width=581,height=216!{color}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16203) Discover datanodes with unbalanced block pool usage by the standard deviation

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16203?focusedWorklogId=649704=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649704
 ]

ASF GitHub Bot logged work on HDFS-16203:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 13:43
Start Date: 12/Sep/21 13:43
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3366:
URL: https://github.com/apache/hadoop/pull/3366#issuecomment-917639022


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m  0s |  |  Docker mode activated.  |
   | -1 :x: |  patch  |   0m 19s |  |  
https://github.com/apache/hadoop/pull/3366 does not apply to trunk. Rebase 
required? Wrong Branch? See 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.  
|
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/3366 |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3366/5/console |
   | versions | git=2.17.1 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649704)
Time Spent: 2h 40m  (was: 2.5h)

> Discover datanodes with unbalanced block pool usage by the standard deviation
> -
>
> Key: HDFS-16203
> URL: https://issues.apache.org/jira/browse/HDFS-16203
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-09-01-19-16-27-172.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Discover datanodes with unbalanced volume usage by the standard deviation.*
> *In some scenarios, we may cause unbalanced datanode disk usage:*
>  1. Repair the damaged disk and make it online again.
>  2. Add disks to some Datanodes.
>  3. Some disks are damaged, resulting in slow data writing.
>  4. Use some custom volume choosing policies.
> In the case of unbalanced disk usage, a sudden increase in datanode write 
> traffic may result in busy disk I/O with low volume usage, resulting in 
> decreased throughput across datanodes.
> We need to find these nodes in time to do diskBalance, or other processing. 
> Based on the volume usage of each datanode, we can calculate the standard 
> deviation of the volume usage. The more unbalanced the volume, the higher the 
> standard deviation.
> *We can display the result on the Web of namenode, and then sorting directly 
> to find the nodes where the volumes usages are unbalanced.*
> *{color:#172b4d}This interface is only used to obtain metrics and does not 
> adversely affect namenode performance.{color}*
>  
> {color:#172b4d}!image-2021-09-01-19-16-27-172.png|width=581,height=216!{color}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16197) Simplify getting NNStorage in FSNamesystem

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16197?focusedWorklogId=649691=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649691
 ]

ASF GitHub Bot logged work on HDFS-16197:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 11:59
Start Date: 12/Sep/21 11:59
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3357:
URL: https://github.com/apache/hadoop/pull/3357#issuecomment-917622023


   @jianghuazhu Thanks for contribution, @ayushtkn Thanks for review!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649691)
Time Spent: 3h 10m  (was: 3h)

> Simplify getting NNStorage in FSNamesystem
> --
>
> Key: HDFS-16197
> URL: https://issues.apache.org/jira/browse/HDFS-16197
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In FSNamesystem, there are many places where NNStorage needs to be used 
> (according to preliminary statistics, there are 15 times), and these places 
> are obtained using "getFSImage().getStorage()". We should try to use a 
> simpler way.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16197) Simplify getting NNStorage in FSNamesystem

2021-09-12 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-16197.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Simplify getting NNStorage in FSNamesystem
> --
>
> Key: HDFS-16197
> URL: https://issues.apache.org/jira/browse/HDFS-16197
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In FSNamesystem, there are many places where NNStorage needs to be used 
> (according to preliminary statistics, there are 15 times), and these places 
> are obtained using "getFSImage().getStorage()". We should try to use a 
> simpler way.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16197) Simplify getting NNStorage in FSNamesystem

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16197?focusedWorklogId=649690=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649690
 ]

ASF GitHub Bot logged work on HDFS-16197:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 11:58
Start Date: 12/Sep/21 11:58
Worklog Time Spent: 10m 
  Work Description: ferhui merged pull request #3357:
URL: https://github.com/apache/hadoop/pull/3357


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649690)
Time Spent: 3h  (was: 2h 50m)

> Simplify getting NNStorage in FSNamesystem
> --
>
> Key: HDFS-16197
> URL: https://issues.apache.org/jira/browse/HDFS-16197
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> In FSNamesystem, there are many places where NNStorage needs to be used 
> (according to preliminary statistics, there are 15 times), and these places 
> are obtained using "getFSImage().getStorage()". We should try to use a 
> simpler way.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

2021-09-12 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413697#comment-17413697
 ] 

Hui Fei commented on HDFS-16210:


[~Symious] Great ! You can submit a PR for branch-3.2

> RBF: Add the option of refreshCallQueue to RouterAdmin
> --
>
> Key: HDFS-16210
> URL: https://issues.apache.org/jira/browse/HDFS-16210
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> We enabled FairCallQueue to RouterRpcServer, but Router can not 
> refreshCallQueue as NameNode does.
> This ticket is to enable the refreshCallQueue for Router so that we don't 
> have to restart the Routers when updating FairCallQueue configurations.
>  
> The option is not to refreshCallQueue to NameNodes, just trying to refresh 
> the callQueue of Router itself.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16221) RBF: Add usage of refreshCallQueue for Router

2021-09-12 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413696#comment-17413696
 ] 

Hui Fei commented on HDFS-16221:


cherry-pick to branch-3.3 tomorrow

> RBF: Add usage of refreshCallQueue for Router
> -
>
> Key: HDFS-16221
> URL: https://issues.apache.org/jira/browse/HDFS-16221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In HDFS-16210, the feature of refreshCallQueue is added to RouterAdmin, but 
> the usageInfo is not added. So when user is typing "hdfs dfsrouteraadmin", 
> the option of "refreshCallQueue" will not be shown.
> This ticket is to add the usage information for refreshCallQueue for Router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16221) RBF: Add usage of refreshCallQueue for Router

2021-09-12 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-16221.

Fix Version/s: 3.4.0
   Resolution: Fixed

> RBF: Add usage of refreshCallQueue for Router
> -
>
> Key: HDFS-16221
> URL: https://issues.apache.org/jira/browse/HDFS-16221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In HDFS-16210, the feature of refreshCallQueue is added to RouterAdmin, but 
> the usageInfo is not added. So when user is typing "hdfs dfsrouteraadmin", 
> the option of "refreshCallQueue" will not be shown.
> This ticket is to add the usage information for refreshCallQueue for Router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16221) RBF: Add usage of refreshCallQueue for Router

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16221?focusedWorklogId=649687=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649687
 ]

ASF GitHub Bot logged work on HDFS-16221:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 11:53
Start Date: 12/Sep/21 11:53
Worklog Time Spent: 10m 
  Work Description: ferhui merged pull request #3421:
URL: https://github.com/apache/hadoop/pull/3421


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649687)
Time Spent: 50m  (was: 40m)

> RBF: Add usage of refreshCallQueue for Router
> -
>
> Key: HDFS-16221
> URL: https://issues.apache.org/jira/browse/HDFS-16221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In HDFS-16210, the feature of refreshCallQueue is added to RouterAdmin, but 
> the usageInfo is not added. So when user is typing "hdfs dfsrouteraadmin", 
> the option of "refreshCallQueue" will not be shown.
> This ticket is to add the usage information for refreshCallQueue for Router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16221) RBF: Add usage of refreshCallQueue for Router

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16221?focusedWorklogId=649688=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649688
 ]

ASF GitHub Bot logged work on HDFS-16221:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 11:53
Start Date: 12/Sep/21 11:53
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3421:
URL: https://github.com/apache/hadoop/pull/3421#issuecomment-917621070


   @symious Thanks for contribution, @goiri Thanks for review!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649688)
Time Spent: 1h  (was: 50m)

> RBF: Add usage of refreshCallQueue for Router
> -
>
> Key: HDFS-16221
> URL: https://issues.apache.org/jira/browse/HDFS-16221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In HDFS-16210, the feature of refreshCallQueue is added to RouterAdmin, but 
> the usageInfo is not added. So when user is typing "hdfs dfsrouteraadmin", 
> the option of "refreshCallQueue" will not be shown.
> This ticket is to add the usage information for refreshCallQueue for Router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16223) AvailableSpaceRackFaultTolerantBlockPlacementPolicy should use chooseRandomWithStorageTypeTwoTrial() for better performance.

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16223?focusedWorklogId=649679=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649679
 ]

ASF GitHub Bot logged work on HDFS-16223:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 10:06
Start Date: 12/Sep/21 10:06
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3424:
URL: https://github.com/apache/hadoop/pull/3424#issuecomment-917604882


   With this, it has better alignment with `AvailableSpaceBlockPlacementPolicy` 
as well as default impl of `BlockPlacementPolicyDefault`. +1 (non-binding)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649679)
Time Spent: 40m  (was: 0.5h)

> AvailableSpaceRackFaultTolerantBlockPlacementPolicy should use 
> chooseRandomWithStorageTypeTwoTrial() for better performance.
> 
>
> Key: HDFS-16223
> URL: https://issues.apache.org/jira/browse/HDFS-16223
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Use chooseRandomWithStorageTypeTwoTrial as AvailableSpaceBlockPlacementPolicy 
> does.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649674=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649674
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 08:00
Start Date: 12/Sep/21 08:00
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-917580384


   > Depends on the sequence of loading cache and constructing volume map, 
sometimes hardlink will just be removed by datanode restart.
   
   Exactly  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649674)
Time Spent: 4h 40m  (was: 4.5h)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE![ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649673=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649673
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 08:00
Start Date: 12/Sep/21 08:00
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-917580278


   I see, yes this seems finer-grained control over the target we want to 
achieve, let me spend some more time with this and will make changes 
accordingly after verifying. Thanks @LeonGao91 for providing your viewpoint on 
this!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649673)
Time Spent: 4.5h  (was: 4h 20m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE![ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=649668=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649668
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 12/Sep/21 07:33
Start Date: 12/Sep/21 07:33
Worklog Time Spent: 10m 
  Work Description: LeonGao91 commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-917576109


   @virajjasani Yes, I just did some more research. Depends on the sequence of 
loading cache and constructing volume map, sometimes hardlink will just be 
removed by datanode restart. So the corner case that needs to be tested here is 
not guaranteed.
   
   I think a less disrupting way to fix this is by controlling 
"deleteDuplicateReplicas" value in the BlockPoolSlice. We can disable this 
until the DirectoryScanner kicks in. Therefore we only need to add a 
VisibleForTesting method in BlockPoolSlice
   
   ```
 @VisibleForTesting
 void setDeleteDuplicateReplicasForTesting(boolean value) {
   deleteDuplicateReplicas = value;
 }
   ```
   
   And in the test, we can do this
   
   ```
   // In the beginning
   conf.setBoolean(
   DFSConfigKeys.DFS_DATANODE_DUPLICATE_REPLICA_DELETION,
   false);
   ...
   // When testing DirectoryScanner should clean up the hardlink
   DirectoryScanner scanner = new DirectoryScanner(
   cluster.getDataNodes().get(0).getFSDataset(), conf);
   FsVolumeImpl volume = (FsVolumeImpl) 
cluster.getDataNodes().get(0).getFSDataset().getFsVolumeReferences().get(0);
   
volume.getBlockPoolSlice(volume.getBlockPoolList()[0]).setDeleteDuplicateReplicasForTesting(true);
   scanner.start();
   scanner.run();
   ```
   
   
   What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 649668)
Time Spent: 4h 20m  (was: 4h 10m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE![ERROR] 
> testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)
>   Time elapsed: 7.768 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org