[jira] [Resolved] (HDFS-16878) TestLeaseRecovery2 timeouts

2022-12-29 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16878.
--
Resolution: Duplicate

Dup of HDFS-16853. Closing.

> TestLeaseRecovery2 timeouts
> ---
>
> Key: HDFS-16878
> URL: https://issues.apache.org/jira/browse/HDFS-16878
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Priority: Major
>
> The following tests in TestLeaseRecovery2 time out:
>  * testHardLeaseRecoveryAfterNameNodeRestart
>  * testHardLeaseRecoveryAfterNameNodeRestart2
>  * testHardLeaseRecoveryWithRenameAfterNameNodeRestart
> {noformat}
> [ERROR] Tests run: 8, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 139.044 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestLeaseRecovery2
> [ERROR] testHardLeaseRecoveryAfterNameNodeRestart(org.apache.hadoop.hdfs.TestLeaseRecovery2)  Time elapsed: 30.47 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 3 milliseconds
>   at java.lang.Thread.sleep(Native Method)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2831)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2880)
>   at org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:594)
>   at org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart(TestLeaseRecovery2.java:498)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:750)
> {noformat}
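
For context on the failure mode: JUnit 4 enforces per-test timeouts by running the test body on a separate thread (the FailOnTimeout and FutureTask frames in the trace) and failing the test with TestTimedOutException when the deadline passes. A minimal, self-contained sketch of that mechanism, not Hadoop code:

{code:java}
import org.junit.Test;

public class TimeoutSketch {
  // JUnit 4 runs this method on a separate thread via FailOnTimeout and
  // fails the test with TestTimedOutException once 30 s elapse - the same
  // frames (FailOnTimeout$CallableStatement, FutureTask) as in the trace.
  @Test(timeout = 30000)
  public void sleepsPastTheTimeout() throws InterruptedException {
    // Stands in for MiniDFSCluster.waitActive(), which polls with Thread.sleep().
    Thread.sleep(60_000);
  }
}
{code}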






[jira] [Created] (HDFS-16878) TestLeaseRecovery2 timeouts

2022-12-29 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-16878:


 Summary: TestLeaseRecovery2 timeouts
 Key: HDFS-16878
 URL: https://issues.apache.org/jira/browse/HDFS-16878
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Akira Ajisaka


The following tests in TestLeaseRecovery2 time out:
 * testHardLeaseRecoveryAfterNameNodeRestart
 * testHardLeaseRecoveryAfterNameNodeRestart2
 * testHardLeaseRecoveryWithRenameAfterNameNodeRestart

{noformat}
[ERROR] Tests run: 8, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 139.044 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestLeaseRecovery2
[ERROR] testHardLeaseRecoveryAfterNameNodeRestart(org.apache.hadoop.hdfs.TestLeaseRecovery2)  Time elapsed: 30.47 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 3 milliseconds
  at java.lang.Thread.sleep(Native Method)
  at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2831)
  at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2880)
  at org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:594)
  at org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart(TestLeaseRecovery2.java:498)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
  at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
  at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.lang.Thread.run(Thread.java:750)
{noformat}






[jira] [Updated] (HDFS-16633) Reserved Space For Replicas is not released on some cases

2022-11-30 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16633:
-
Fix Version/s: 3.3.9

Backported to branch-3.3.

> Reserved Space For Replicas is not released on some cases
> -
>
> Key: HDFS-16633
> URL: https://issues.apache.org/jira/browse/HDFS-16633
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Prabhu Joseph
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Have found that the Reserved Space For Replicas is not released in some cases
> on a Cx Prod cluster. There are a few fixes like HDFS-9530 and HDFS-8072, but
> the issue is still not completely fixed. Have tried to debug the root cause,
> but this will take a lot of time as it is a Cx Prod cluster.
> But we have an easier way to fix the issue completely: release any remaining
> reserved space of the replica from the volume. DataXceiver#writeBlock will
> finally call BlockReceiver#close, which checks whether the ReplicaInfo has any
> remaining reserved space and, if so, releases it from the volume.
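
For readers following along, a minimal sketch of the shape of that fix. The types below are simplified stand-ins, not the real BlockReceiver/ReplicaInPipeline/FsVolumeSpi classes:

{code:java}
interface Volume {
  void releaseReservedSpace(long bytes);
}

class ReplicaSketch {
  private final Volume volume;
  private long bytesReserved; // space reserved up front for the incoming replica

  ReplicaSketch(Volume volume, long bytesReserved) {
    this.volume = volume;
    this.bytesReserved = bytesReserved;
  }

  // Invoked from the receiver's close() path: any reservation still
  // outstanding (e.g. after an aborted pipeline) is handed back to the
  // volume, so failed writes can no longer leak reserved space.
  void releaseRemainingReservedSpace() {
    if (bytesReserved > 0) {
      volume.releaseReservedSpace(bytesReserved);
      bytesReserved = 0;
    }
  }
}
{code}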






[jira] [Commented] (HDFS-16633) Reserved Space For Replicas is not released on some cases

2022-11-20 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636503#comment-17636503
 ] 

Akira Ajisaka commented on HDFS-16633:
--

[~prabhujoseph] [~groot] do we need to backport to branch-3.3?

> Reserved Space For Replicas is not released on some cases
> -
>
> Key: HDFS-16633
> URL: https://issues.apache.org/jira/browse/HDFS-16633
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Prabhu Joseph
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Have found that the Reserved Space For Replicas is not released in some cases
> on a Cx Prod cluster. There are a few fixes like HDFS-9530 and HDFS-8072, but
> the issue is still not completely fixed. Have tried to debug the root cause,
> but this will take a lot of time as it is a Cx Prod cluster.
> But we have an easier way to fix the issue completely: release any remaining
> reserved space of the replica from the volume. DataXceiver#writeBlock will
> finally call BlockReceiver#close, which checks whether the ReplicaInfo has any
> remaining reserved space and, if so, releases it from the volume.






[jira] [Updated] (HDFS-16628) RBF: Correct target directory when move to trash for kerberos login user.

2022-10-11 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16628:
-
Fix Version/s: 3.3.9

Merged https://github.com/apache/hadoop/pull/4974 into branch-3.3.

> RBF: Correct target directory when move to trash for kerberos login user.
> -
>
> Key: HDFS-16628
> URL: https://issues.apache.org/jira/browse/HDFS-16628
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiping Zhang
>Assignee: Xiping Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Removing data via the router will fail when using a Kerberos login user such
> as username/d...@hadoop.com.






[jira] [Updated] (HDFS-16024) RBF: Rename data to the Trash should be based on src locations

2022-10-09 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16024:
-
Fix Version/s: 3.3.9

Merged https://github.com/apache/hadoop/pull/4962 into branch-3.3.

> RBF: Rename data to the Trash should be based on src locations
> --
>
> Key: HDFS-16024
> URL: https://issues.apache.org/jira/browse/HDFS-16024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Assignee: Xiangyi Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> 1. When deleting data to the Trash without a mount point configured for the
> Trash, the Router should recognize this and move the data to the Trash.
> 2. When the user's trash is configured with a mount point whose NS differs
> from the NS of the deleted directory, the Router should recognize this and
> move the data to the trash of the current user on the src NS.
> The same is true when using ViewFs mount points; I think we should be
> consistent with it.
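
A toy sketch of the intended resolution rule (all names hypothetical, not Router code): the trash destination is resolved against the mount location of the source path, so the rename stays within the namespace that owns the data:

{code:java}
class TrashResolutionSketch {
  /** Stand-in for the mount table: returns the namespace owning a path. */
  static String resolveNamespace(String path) {
    return path.startsWith("/data") ? "ns0" : "ns1"; // toy mount table
  }

  /** Trash location keyed off the src path's namespace, not off /user. */
  static String trashPathFor(String user, String srcPath) {
    String ns = resolveNamespace(srcPath);
    return "[" + ns + "]/user/" + user + "/.Trash/Current" + srcPath;
  }

  public static void main(String[] args) {
    // /data/f lives in ns0, so its trash entry must also land in ns0.
    System.out.println(trashPathFor("alice", "/data/f"));
  }
}
{code}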






[jira] [Assigned] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space

2022-10-01 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-3570:
---

Assignee: (was: Akira Ajisaka)

> Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used 
> space
> 
>
> Key: HDFS-3570
> URL: https://issues.apache.org/jira/browse/HDFS-3570
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Priority: Minor
> Attachments: HDFS-3570.003.patch, HDFS-3570.2.patch, 
> HDFS-3570.aash.1.patch
>
>
> Report from a user here: 
> https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ,
>  post archived at http://pastebin.com/eVFkk0A0
> This user had a specific DN that had a large non-DFS usage among 
> dfs.data.dirs, and very little DFS usage (which is computed against total 
> possible capacity). 
> Balancer apparently only looks at the DFS usage and fails to consider that
> non-DFS usage may also be high on a DN/cluster. Hence it thinks that if a DFS
> usage report from a DN is only 8%, the DN has a lot of free space to write
> more blocks, when that isn't true, as shown by this user's case. It went on
> scheduling writes to the DN to balance it out, but the DN simply can't accept
> any more blocks as a result of its disks' state.
> I think it would be better if we _computed_ the actual utilization based on
> {{(100-(actual remaining space))/(capacity)}}, as opposed to the current
> {{(dfs used)/(capacity)}}. Thoughts?
> This isn't very critical, however, because it is very rare to see DN space
> being used for non-DN data, but it does expose a valid bug.
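
Reading the proposed formula as remaining-space-based utilization, i.e. (capacity - remaining) / capacity, a small sketch with invented numbers shows how the two metrics diverge when non-DFS usage is high:

{code:java}
public class UtilizationSketch {
  public static void main(String[] args) {
    // Invented numbers matching the shape of the report above.
    double capacity = 10_000;  // GB
    double dfsUsed = 800;      // GB -> the "8% used" the Balancer sees
    double nonDfsUsed = 8_500; // GB of non-DFS data on the same disks
    double remaining = capacity - dfsUsed - nonDfsUsed; // only 700 GB free

    double current = dfsUsed / capacity;                 // 0.08
    double proposed = (capacity - remaining) / capacity; // 0.93

    // The Balancer schedules writes based on 8%, but only 7% of the disk
    // is actually free; the remaining-space formula exposes that.
    System.out.printf("current: %.0f%%, proposed: %.0f%%%n",
        current * 100, proposed * 100);
  }
}
{code}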






[jira] [Assigned] (HDFS-6310) PBImageXmlWriter should output information about Delegation Tokens

2022-10-01 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-6310:
---

Assignee: (was: Akira Ajisaka)

> PBImageXmlWriter should output information about Delegation Tokens
> --
>
> Key: HDFS-6310
> URL: https://issues.apache.org/jira/browse/HDFS-6310
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.4.0
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-6310.patch
>
>
> Separated from HDFS-6293.
> The 2.4.0 pb-fsimage does contain tokens, but OfflineImageViewer with the
> -XML option does not show any tokens.






[jira] [Assigned] (HDFS-8944) Make dfsadmin command options case insensitive

2022-10-01 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-8944:
---

Assignee: (was: Akira Ajisaka)

> Make dfsadmin command options case insensitive
> --
>
> Key: HDFS-8944
> URL: https://issues.apache.org/jira/browse/HDFS-8944
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-8944.001.patch, HDFS-8944.002.patch
>
>
> Currently, dfsadmin command options are case sensitive, except for
> allowSnapshot and disallowSnapshot. It would be better to make them case
> insensitive for usability and consistency.
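
A minimal illustration of the requested behavior (a sketch only, not the dfsadmin parser): match each argument case-insensitively against the canonical option spelling:

{code:java}
import java.util.Arrays;
import java.util.List;

public class CaseInsensitiveOptions {
  private static final List<String> OPTIONS =
      Arrays.asList("-allowSnapshot", "-disallowSnapshot", "-safemode");

  /** Returns the canonical option, or null if the argument matches none. */
  static String canonicalize(String arg) {
    return OPTIONS.stream()
        .filter(opt -> opt.equalsIgnoreCase(arg))
        .findFirst()
        .orElse(null);
  }

  public static void main(String[] args) {
    System.out.println(canonicalize("-ALLOWSNAPSHOT")); // -> -allowSnapshot
  }
}
{code}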






[jira] [Assigned] (HDFS-5156) SafeModeTime metrics sometimes includes non-Safemode time.

2022-10-01 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-5156:
---

Assignee: (was: Akira Ajisaka)

> SafeModeTime metrics sometimes includes non-Safemode time.
> --
>
> Key: HDFS-5156
> URL: https://issues.apache.org/jira/browse/HDFS-5156
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.1.0-beta, 1.2.1
>Reporter: Akira Ajisaka
>Priority: Major
>  Labels: BB2015-05-TBR, metrics
> Attachments: HDFS-5156.2.patch, HDFS-5156.patch
>
>
> The SafeModeTime metric shows the time spent in safe mode during startup.
> However, the metric is set to the time elapsed since FSNamesystem started
> whenever safe mode is left. As a result, after executing "hdfs dfsadmin
> -safemode enter" and "hdfs dfsadmin -safemode leave", the metric includes
> non-Safemode time.
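
A sketch of the reported bug and the obvious fix (names simplified; the real code lives in the NameNode): the duration should be measured from safe-mode entry, not from system start:

{code:java}
public class SafeModeTimeSketch {
  private final long systemStart = System.currentTimeMillis();
  private long safeModeEntered;
  private long safeModeTimeMs;

  void enterSafeMode() {
    safeModeEntered = System.currentTimeMillis();
  }

  void leaveSafeMode() {
    long now = System.currentTimeMillis();
    // Reported bug: counting everything since startup, so a manual
    // enter/leave long after startup inflates the metric:
    //   safeModeTimeMs = now - systemStart;
    // Correct: count only the time actually spent in safe mode.
    safeModeTimeMs = now - safeModeEntered;
  }
}
{code}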






[jira] [Assigned] (HDFS-5361) Change the unit of StartupProgress 'PercentComplete' to percentage

2022-10-01 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-5361:
---

Assignee: (was: Akira Ajisaka)

> Change the unit of StartupProgress 'PercentComplete' to percentage
> --
>
> Key: HDFS-5361
> URL: https://issues.apache.org/jira/browse/HDFS-5361
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.1.0-beta
>Reporter: Akira Ajisaka
>Priority: Minor
>  Labels: BB2015-05-TBR, metrics, newbie
> Attachments: HDFS-5361.2.patch, HDFS-5361.3.patch, HDFS-5361.patch
>
>
> Currently the 'PercentComplete' metric is reported as a ratio (maximum is
> 1.0). This is confusing for users because its name includes "percent".
> The metric should be multiplied by 100.






[jira] [Resolved] (HDFS-16766) XML External Entity (XXE) attacks can occur while processing XML received from an untrusted source

2022-09-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16766.
--
Fix Version/s: 3.4.0, 3.3.9, 3.2.5
Resolution: Fixed

Committed to trunk, branch-3.3, and branch-3.2. Thank you [~Du] for your report 
and thank you [~groot] for your fix!

> XML External Entity (XXE) attacks can occur while processing XML received 
> from an untrusted source
> --
>
> Key: HDFS-16766
> URL: https://issues.apache.org/jira/browse/HDFS-16766
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.3.4
>Reporter: Jing
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9, 3.2.5
>
>
> XML External Entity (XXE) attacks can occur when an XML parser supports XML
> entities while processing XML received from an untrusted source. The attack
> resides in XML input containing references to an external entity and is
> parsed by the weakly configured javax.xml.parsers.DocumentBuilder XML parser.
>  
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/ECPolicyLoader.java#L93
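
The usual JAXP hardening for this class of bug looks like the following; this is a general sketch, not necessarily the exact change made to ECPolicyLoader:

{code:java}
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

public class SafeXmlParser {
  static DocumentBuilder newSafeBuilder() throws ParserConfigurationException {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    // Disallow DOCTYPE declarations entirely (strongest protection).
    dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // Belt and braces: no external entities or external DTD/schema loading.
    dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
    dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    dbf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
    dbf.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
    dbf.setXIncludeAware(false);
    dbf.setExpandEntityReferences(false);
    return dbf.newDocumentBuilder();
  }
}
{code}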






[jira] [Resolved] (HDFS-16729) RBF: fix some unreasonably annotated docs

2022-08-20 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16729.
--
Fix Version/s: 3.4.0, 3.3.9
Resolution: Fixed

Committed to trunk and branch-3.3. Thank you [~jianghuazhu] for your 
contribution!

> RBF: fix some unreasonably annotated docs
> -
>
> Key: HDFS-16729
> URL: https://issues.apache.org/jira/browse/HDFS-16729
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, rbf
>Affects Versions: 3.3.3
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
> Attachments: image-2022-08-16-14-19-07-630.png
>
>
> I found some unreasonably annotated documentation here, e.g.:
>  !image-2022-08-16-14-19-07-630.png! 
> It should be our job to make these annotations cleaner.






[jira] [Updated] (HDFS-16729) RBF: fix some unreasonably annotated docs

2022-08-20 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16729:
-
Issue Type: Bug  (was: Improvement)

> RBF: fix some unreasonably annotated docs
> -
>
> Key: HDFS-16729
> URL: https://issues.apache.org/jira/browse/HDFS-16729
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, rbf
>Affects Versions: 3.3.3
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-08-16-14-19-07-630.png
>
>
> I found some unreasonably annotated documentation here, e.g.:
>  !image-2022-08-16-14-19-07-630.png! 
> It should be our job to make these annotations cleaner.






[jira] [Commented] (HDFS-16731) hadoop-minikdc dependency duplicated in hadoop-yarn-server-nodemanager pom.xml

2022-08-20 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582388#comment-17582388
 ] 

Akira Ajisaka commented on HDFS-16731:
--

Note: You can move an issue into another project.

More -> Move

> hadoop-minikdc dependency duplicated in hadoop-yarn-server-nodemanager pom.xml
> --
>
> Key: HDFS-16731
> URL: https://issues.apache.org/jira/browse/HDFS-16731
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Clara Fang
>Priority: Minor
>  Labels: pull-request-available
>
> The dependency hadoop-minikdc is defined twice in 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/pom.xml
> {code:xml}
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-minikdc</artifactId>
>   <scope>test</scope>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-minikdc</artifactId>
>   <scope>test</scope>
> </dependency>
> {code}






[jira] [Updated] (HDFS-16064) Determine when to invalidate corrupt replicas based on number of usable replicas

2022-06-19 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16064:
-
Fix Version/s: 3.2.4

Cherry-picked to branch-3.2.

> Determine when to invalidate corrupt replicas based on number of usable 
> replicas
> 
>
> Key: HDFS-16064
> URL: https://issues.apache.org/jira/browse/HDFS-16064
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.2.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Seems that https://issues.apache.org/jira/browse/HDFS-721 was resolved as a 
> non-issue under the assumption that if the namenode & a datanode get into an 
> inconsistent state for a given block pipeline, there should be another 
> datanode available to replicate the block to
> While testing datanode decommissioning using "dfs.exclude.hosts", I have 
> encountered a scenario where the decommissioning gets stuck indefinitely
> Below is the progression of events:
>  * there are initially 4 datanodes DN1, DN2, DN3, DN4
>  * scale-down is started by adding DN1 & DN2 to "dfs.exclude.hosts"
>  * HDFS block pipelines on DN1 & DN2 must now be replicated to DN3 & DN4 in 
> order to satisfy their minimum replication factor of 2
>  * during this replication process 
> https://issues.apache.org/jira/browse/HDFS-721 is encountered which causes 
> the following inconsistent state:
>  ** DN3 thinks it has the block pipeline in FINALIZED state
>  ** the namenode does not think DN3 has the block pipeline
> {code:java}
> 2021-06-06 10:38:23,604 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client  at /DN2:45654 [Receiving block BP-YYY:blk_XXX]): 
> DN3:9866:DataXceiver error processing WRITE_BLOCK operation  src: /DN2:45654 
> dst: /DN3:9866; 
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> BP-YYY:blk_XXX already exists in state FINALIZED and thus cannot be created.
> {code}
>  * the replication is attempted again, but:
>  ** DN4 has the block
>  ** DN1 and/or DN2 have the block, but don't count towards the minimum 
> replication factor because they are being decommissioned
>  ** DN3 does not have the block & cannot have the block replicated to it 
> because of HDFS-721
>  * the namenode repeatedly tries to replicate the block to DN3 & repeatedly 
> fails, this continues indefinitely
>  * therefore DN4 is the only live datanode with the block & the minimum 
> replication factor of 2 cannot be satisfied
>  * because the minimum replication factor cannot be satisfied for the 
> block(s) being moved off DN1 & DN2, the datanode decommissioning can never be 
> completed 
> {code:java}
> 2021-06-06 10:39:10,106 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN1:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> ...
> 2021-06-06 10:57:10,105 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN2:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> {code}
> Being stuck in decommissioning state forever is not an intended behavior of 
> DataNode decommissioning
> A few potential solutions:
>  * Address the root cause of the problem which is an inconsistent state 
> between namenode & datanode: https://issues.apache.org/jira/browse/HDFS-721
>  * Detect when datanode decommissioning is stuck due to lack of available 
> datanodes for satisfying the minimum replication factor, then recover by 
> re-enabling the datanodes being decommissioned
>  
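
A hedged sketch of the decision the retitled fix describes (field names invented): a corrupt or stale copy should only be invalidated once enough usable replicas exist, where replicas on decommissioning nodes still count as usable even though they are not live:

{code:java}
class ReplicaInvalidationSketch {
  int liveReplicas;            // e.g. 1 (DN4) in the scenario above
  int decommissioningReplicas; // e.g. 2 (DN1, DN2)
  int minReplication;          // e.g. 2

  // usable = 3 >= 2 here, so the stale copy on DN3 can be invalidated and
  // re-replicated instead of the NameNode retrying forever.
  boolean safeToInvalidateCorruptReplica() {
    int usable = liveReplicas + decommissioningReplicas;
    return usable >= minReplication;
  }
}
{code}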






[jira] [Updated] (HDFS-16064) Determine when to invalidate corrupt replicas based on number of usable replicas

2022-06-19 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16064:
-
Summary: Determine when to invalidate corrupt replicas based on number of 
usable replicas  (was: HDFS-721 causes DataNode decommissioning to get stuck 
indefinitely)

> Determine when to invalidate corrupt replicas based on number of usable 
> replicas
> 
>
> Key: HDFS-16064
> URL: https://issues.apache.org/jira/browse/HDFS-16064
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.2.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Seems that https://issues.apache.org/jira/browse/HDFS-721 was resolved as a 
> non-issue under the assumption that if the namenode & a datanode get into an 
> inconsistent state for a given block pipeline, there should be another 
> datanode available to replicate the block to
> While testing datanode decommissioning using "dfs.exclude.hosts", I have 
> encountered a scenario where the decommissioning gets stuck indefinitely
> Below is the progression of events:
>  * there are initially 4 datanodes DN1, DN2, DN3, DN4
>  * scale-down is started by adding DN1 & DN2 to "dfs.exclude.hosts"
>  * HDFS block pipelines on DN1 & DN2 must now be replicated to DN3 & DN4 in 
> order to satisfy their minimum replication factor of 2
>  * during this replication process 
> https://issues.apache.org/jira/browse/HDFS-721 is encountered which causes 
> the following inconsistent state:
>  ** DN3 thinks it has the block pipeline in FINALIZED state
>  ** the namenode does not think DN3 has the block pipeline
> {code:java}
> 2021-06-06 10:38:23,604 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client  at /DN2:45654 [Receiving block BP-YYY:blk_XXX]): 
> DN3:9866:DataXceiver error processing WRITE_BLOCK operation  src: /DN2:45654 
> dst: /DN3:9866; 
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> BP-YYY:blk_XXX already exists in state FINALIZED and thus cannot be created.
> {code}
>  * the replication is attempted again, but:
>  ** DN4 has the block
>  ** DN1 and/or DN2 have the block, but don't count towards the minimum 
> replication factor because they are being decommissioned
>  ** DN3 does not have the block & cannot have the block replicated to it 
> because of HDFS-721
>  * the namenode repeatedly tries to replicate the block to DN3 & repeatedly 
> fails, this continues indefinitely
>  * therefore DN4 is the only live datanode with the block & the minimum 
> replication factor of 2 cannot be satisfied
>  * because the minimum replication factor cannot be satisfied for the 
> block(s) being moved off DN1 & DN2, the datanode decommissioning can never be 
> completed 
> {code:java}
> 2021-06-06 10:39:10,106 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN1:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> ...
> 2021-06-06 10:57:10,105 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN2:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> {code}
> Being stuck in decommissioning state forever is not an intended behavior of 
> DataNode decommissioning
> A few potential solutions:
>  * Address the root cause of the problem which is an inconsistent state 
> between namenode & datanode: https://issues.apache.org/jira/browse/HDFS-721
>  * Detect when datanode decommissioning is stuck due to lack of available 
> datanodes for satisfying the minimum replication factor, then recover by 
> re-enabling the datanodes being decommissioned
>  






[jira] [Resolved] (HDFS-16064) HDFS-721 causes DataNode decommissioning to get stuck indefinitely

2022-06-19 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16064.
--
Fix Version/s: 3.4.0, 3.3.4
Resolution: Fixed

Merged the PR into trunk and branch-3.3.

> HDFS-721 causes DataNode decommissioning to get stuck indefinitely
> --
>
> Key: HDFS-16064
> URL: https://issues.apache.org/jira/browse/HDFS-16064
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.2.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Seems that https://issues.apache.org/jira/browse/HDFS-721 was resolved as a 
> non-issue under the assumption that if the namenode & a datanode get into an 
> inconsistent state for a given block pipeline, there should be another 
> datanode available to replicate the block to
> While testing datanode decommissioning using "dfs.exclude.hosts", I have 
> encountered a scenario where the decommissioning gets stuck indefinitely
> Below is the progression of events:
>  * there are initially 4 datanodes DN1, DN2, DN3, DN4
>  * scale-down is started by adding DN1 & DN2 to "dfs.exclude.hosts"
>  * HDFS block pipelines on DN1 & DN2 must now be replicated to DN3 & DN4 in 
> order to satisfy their minimum replication factor of 2
>  * during this replication process 
> https://issues.apache.org/jira/browse/HDFS-721 is encountered which causes 
> the following inconsistent state:
>  ** DN3 thinks it has the block pipeline in FINALIZED state
>  ** the namenode does not think DN3 has the block pipeline
> {code:java}
> 2021-06-06 10:38:23,604 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client  at /DN2:45654 [Receiving block BP-YYY:blk_XXX]): 
> DN3:9866:DataXceiver error processing WRITE_BLOCK operation  src: /DN2:45654 
> dst: /DN3:9866; 
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> BP-YYY:blk_XXX already exists in state FINALIZED and thus cannot be created.
> {code}
>  * the replication is attempted again, but:
>  ** DN4 has the block
>  ** DN1 and/or DN2 have the block, but don't count towards the minimum 
> replication factor because they are being decommissioned
>  ** DN3 does not have the block & cannot have the block replicated to it 
> because of HDFS-721
>  * the namenode repeatedly tries to replicate the block to DN3 & repeatedly 
> fails, this continues indefinitely
>  * therefore DN4 is the only live datanode with the block & the minimum 
> replication factor of 2 cannot be satisfied
>  * because the minimum replication factor cannot be satisfied for the 
> block(s) being moved off DN1 & DN2, the datanode decommissioning can never be 
> completed 
> {code:java}
> 2021-06-06 10:39:10,106 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN1:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> ...
> 2021-06-06 10:57:10,105 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN2:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> {code}
> Being stuck in decommissioning state forever is not an intended behavior of 
> DataNode decommissioning
> A few potential solutions:
>  * Address the root cause of the problem which is an inconsistent state 
> between namenode & datanode: https://issues.apache.org/jira/browse/HDFS-721
>  * Detect when datanode decommissioning is stuck due to lack of available 
> datanodes for satisfying the minimum replication factor, then recover by 
> re-enabling the datanodes being decommissioned
>  






[jira] [Assigned] (HDFS-16064) HDFS-721 causes DataNode decommissioning to get stuck indefinitely

2022-06-19 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16064:


Assignee: Kevin Wikant

> HDFS-721 causes DataNode decommissioning to get stuck indefinitely
> --
>
> Key: HDFS-16064
> URL: https://issues.apache.org/jira/browse/HDFS-16064
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.2.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Seems that https://issues.apache.org/jira/browse/HDFS-721 was resolved as a 
> non-issue under the assumption that if the namenode & a datanode get into an 
> inconsistent state for a given block pipeline, there should be another 
> datanode available to replicate the block to
> While testing datanode decommissioning using "dfs.exclude.hosts", I have 
> encountered a scenario where the decommissioning gets stuck indefinitely
> Below is the progression of events:
>  * there are initially 4 datanodes DN1, DN2, DN3, DN4
>  * scale-down is started by adding DN1 & DN2 to "dfs.exclude.hosts"
>  * HDFS block pipelines on DN1 & DN2 must now be replicated to DN3 & DN4 in 
> order to satisfy their minimum replication factor of 2
>  * during this replication process 
> https://issues.apache.org/jira/browse/HDFS-721 is encountered which causes 
> the following inconsistent state:
>  ** DN3 thinks it has the block pipeline in FINALIZED state
>  ** the namenode does not think DN3 has the block pipeline
> {code:java}
> 2021-06-06 10:38:23,604 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client  at /DN2:45654 [Receiving block BP-YYY:blk_XXX]): 
> DN3:9866:DataXceiver error processing WRITE_BLOCK operation  src: /DN2:45654 
> dst: /DN3:9866; 
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> BP-YYY:blk_XXX already exists in state FINALIZED and thus cannot be created.
> {code}
>  * the replication is attempted again, but:
>  ** DN4 has the block
>  ** DN1 and/or DN2 have the block, but don't count towards the minimum 
> replication factor because they are being decommissioned
>  ** DN3 does not have the block & cannot have the block replicated to it 
> because of HDFS-721
>  * the namenode repeatedly tries to replicate the block to DN3 & repeatedly 
> fails, this continues indefinitely
>  * therefore DN4 is the only live datanode with the block & the minimum 
> replication factor of 2 cannot be satisfied
>  * because the minimum replication factor cannot be satisfied for the 
> block(s) being moved off DN1 & DN2, the datanode decommissioning can never be 
> completed 
> {code:java}
> 2021-06-06 10:39:10,106 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN1:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> ...
> 2021-06-06 10:57:10,105 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN2:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> {code}
> Being stuck in decommissioning state forever is not an intended behavior of 
> DataNode decommissioning
> A few potential solutions:
>  * Address the root cause of the problem which is an inconsistent state 
> between namenode & datanode: https://issues.apache.org/jira/browse/HDFS-721
>  * Detect when datanode decommissioning is stuck due to lack of available 
> datanodes for satisfying the minimum replication factor, then recover by 
> re-enabling the datanodes being decommissioned
>  






[jira] [Commented] (HDFS-16635) Fix javadoc error in Java 11

2022-06-17 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17555393#comment-17555393
 ] 

Akira Ajisaka commented on HDFS-16635:
--

I think we can remove the link to NameNode instead of importing the NameNode class.

> Fix javadoc error in Java 11
> 
>
> Key: HDFS-16635
> URL: https://issues.apache.org/jira/browse/HDFS-16635
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build, documentation
>Reporter: Akira Ajisaka
>Priority: Major
>
> Javadoc build in Java 11 fails.
> {noformat}
> [ERROR] 
> /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4410/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/package-info.java:20:
>  error: reference not found
> [ERROR]  * This package provides a mechanism for tracking {@link NameNode} 
> startup
> {noformat}
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4410/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt
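
One way to apply that suggestion (a sketch of the package-info.java change): {@code} renders the name in code font without requiring a resolvable reference, so no import is needed and the Java 11 javadoc error disappears:

{code:java}
/**
 * This package provides a mechanism for tracking {@code NameNode} startup
 * progress. Using {@code ...} instead of {@link ...} avoids the unresolved
 * reference, since NameNode is not imported here.
 */
package org.apache.hadoop.hdfs.server.namenode.startupprogress;
{code}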






[jira] [Updated] (HDFS-16635) Fix javadoc error in Java 11

2022-06-17 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16635:
-
Labels: newbie  (was: )

> Fix javadoc error in Java 11
> 
>
> Key: HDFS-16635
> URL: https://issues.apache.org/jira/browse/HDFS-16635
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build, documentation
>Reporter: Akira Ajisaka
>Priority: Major
>  Labels: newbie
>
> Javadoc build in Java 11 fails.
> {noformat}
> [ERROR] 
> /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4410/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/package-info.java:20:
>  error: reference not found
> [ERROR]  * This package provides a mechanism for tracking {@link NameNode} 
> startup
> {noformat}
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4410/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt






[jira] [Commented] (HDFS-16576) Remove unused imports in HDFS project

2022-06-17 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17555392#comment-17555392
 ] 

Akira Ajisaka commented on HDFS-16576:
--

Hi [~groot]
This commit caused the javadoc error reported in HDFS-16635. Sorry I missed
the error while reviewing this PR. Could you fix it?

> Remove unused imports in HDFS project
> -
>
> Key: HDFS-16576
> URL: https://issues.apache.org/jira/browse/HDFS-16576
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> h3. Optimize Imports to keep code clean
>  # Remove any unused imports






[jira] [Updated] (HDFS-16635) Fix javadoc error in Java 11

2022-06-17 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16635:
-
Target Version/s: 3.4.0, 3.3.4  (was: 3.4.0)

> Fix javadoc error in Java 11
> 
>
> Key: HDFS-16635
> URL: https://issues.apache.org/jira/browse/HDFS-16635
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build, documentation
>Reporter: Akira Ajisaka
>Priority: Major
>
> Javadoc build in Java 11 fails.
> {noformat}
> [ERROR] 
> /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4410/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/package-info.java:20:
>  error: reference not found
> [ERROR]  * This package provides a mechanism for tracking {@link NameNode} 
> startup
> {noformat}
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4410/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt






[jira] [Created] (HDFS-16635) Fix javadoc error in Java 11

2022-06-17 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-16635:


 Summary: Fix javadoc error in Java 11
 Key: HDFS-16635
 URL: https://issues.apache.org/jira/browse/HDFS-16635
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, documentation
Reporter: Akira Ajisaka


Javadoc build in Java 11 fails.

{noformat}
[ERROR] 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4410/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/package-info.java:20:
 error: reference not found
[ERROR]  * This package provides a mechanism for tracking {@link NameNode} 
startup
{noformat}

https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4410/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt






[jira] [Updated] (HDFS-16576) Remove unused imports in HDFS project

2022-06-09 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16576:
-
Fix Version/s: 3.3.4

Backported to branch-3.3.

> Remove unused imports in HDFS project
> -
>
> Key: HDFS-16576
> URL: https://issues.apache.org/jira/browse/HDFS-16576
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> h3. Optimize Imports to keep code clean
>  # Remove any unused imports






[jira] [Updated] (HDFS-16576) Remove unused imports in HDFS project

2022-06-09 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16576:
-
Summary: Remove unused imports in HDFS project  (was: Remove unused Imports 
in Hadoop HDFS project)

> Remove unused imports in HDFS project
> -
>
> Key: HDFS-16576
> URL: https://issues.apache.org/jira/browse/HDFS-16576
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> h3. Optimize Imports to keep code clean
>  # Remove any unused imports






[jira] [Resolved] (HDFS-16576) Remove unused Imports in Hadoop HDFS project

2022-06-09 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16576.
--
Fix Version/s: 3.4.0
Resolution: Fixed

Committed to trunk.

> Remove unused Imports in Hadoop HDFS project
> 
>
> Key: HDFS-16576
> URL: https://issues.apache.org/jira/browse/HDFS-16576
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> h3. Optimize Imports to keep code clean
>  # Remove any unused imports






[jira] [Resolved] (HDFS-16608) Fix the link in TestClientProtocolForPipelineRecovery

2022-06-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16608.
--
Fix Version/s: 3.4.0, 3.3.4
Resolution: Fixed

Committed to trunk and branch-3.3. Thank you [~samrat007] for your contribution.

> Fix the link in TestClientProtocolForPipelineRecovery
> -
>
> Key: HDFS-16608
> URL: https://issues.apache.org/jira/browse/HDFS-16608
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Samrat Deb
>Assignee: Samrat Deb
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>







[jira] [Assigned] (HDFS-16608) Fix the link in TestClientProtocolForPipelineRecovery

2022-06-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16608:


Assignee: Samrat Deb

> Fix the link in TestClientProtocolForPipelineRecovery
> -
>
> Key: HDFS-16608
> URL: https://issues.apache.org/jira/browse/HDFS-16608
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Samrat Deb
>Assignee: Samrat Deb
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>







[jira] [Updated] (HDFS-16608) Fix the link in TestClientProtocolForPipelineRecovery

2022-06-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16608:
-
Summary: Fix the link in TestClientProtocolForPipelineRecovery  (was: @Link 
in doc to private variable DataStreamer. pipelineRecoveryCount)

> Fix the link in TestClientProtocolForPipelineRecovery
> -
>
> Key: HDFS-16608
> URL: https://issues.apache.org/jira/browse/HDFS-16608
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Samrat Deb
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>







[jira] [Updated] (HDFS-16608) @Link in doc to private variable DataStreamer. pipelineRecoveryCount

2022-06-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16608:
-
Component/s: documentation

> @Link in doc to private variable DataStreamer. pipelineRecoveryCount
> 
>
> Key: HDFS-16608
> URL: https://issues.apache.org/jira/browse/HDFS-16608
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Samrat Deb
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>







[jira] [Resolved] (HDFS-16604) Install gtest via FetchContent_Declare in CMake

2022-05-31 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16604.
--
Fix Version/s: 3.4.0
Resolution: Fixed

Merged the PR into trunk.

> Install gtest via FetchContent_Declare in CMake
> ---
>
> Key: HDFS-16604
> URL: https://issues.apache.org/jira/browse/HDFS-16604
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs++
>Affects Versions: 3.4.0
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> CMake is unable to check out the *release-1.10.0* version of GoogleTest -
> {code}
> [WARNING] -- Build files have been written to: 
> /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4370/centos-7/src/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/googletest-download
> [WARNING] Scanning dependencies of target googletest
> [WARNING] [ 11%] Creating directories for 'googletest'
> [WARNING] [ 22%] Performing download step (git clone) for 'googletest'
> [WARNING] Cloning into 'googletest-src'...
> [WARNING] fatal: invalid reference: release-1.10.0
> [WARNING] CMake Error at 
> googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake:40 
> (message):
> [WARNING]   Failed to checkout tag: 'release-1.10.0'
> [WARNING] 
> [WARNING] 
> [WARNING] gmake[2]: *** [CMakeFiles/googletest.dir/build.make:111: 
> googletest-prefix/src/googletest-stamp/googletest-download] Error 1
> [WARNING] gmake[1]: *** [CMakeFiles/Makefile2:95: 
> CMakeFiles/googletest.dir/all] Error 2
> [WARNING] gmake: *** [Makefile:103: all] Error 2
> [WARNING] CMake Error at main/native/libhdfspp/CMakeLists.txt:68 (message):
> [WARNING]   Build step for googletest failed: 2
> {code}
> Jenkins run - 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4370/6/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client.txt
> We need to use *FetchContent_Declare* since we're getting the source code 
> exactly at the given commit SHA. This avoids the checkout step altogether and 
> solves the above issue.
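
A sketch of that approach (the URL and commit SHA below are placeholders, not the values used in the actual change):

{code}
include(FetchContent)

FetchContent_Declare(
  googletest
  # Downloading a source archive pinned to an exact commit avoids the
  # "git clone + checkout tag" step that fails above.
  URL https://github.com/google/googletest/archive/<commit-sha>.tar.gz
)
FetchContent_MakeAvailable(googletest)
# The gtest/gtest_main targets are then available to link against.
{code}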






[jira] [Resolved] (HDFS-16453) Upgrade okhttp from 2.7.5 to 4.9.3

2022-05-20 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16453.
--
Fix Version/s: 3.4.0, 3.3.4
Resolution: Fixed

Committed to trunk and branch-3.3. Thank you [~ivan.viaznikov] for your report 
and thank you [~groot] for your contribution!

> Upgrade okhttp from 2.7.5 to 4.9.3
> --
>
> Key: HDFS-16453
> URL: https://issues.apache.org/jira/browse/HDFS-16453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1
>Reporter: Ivan Viaznikov
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{org.apache.hadoop:hadoop-hdfs-client}} comes with 
> {{com.squareup.okhttp:okhttp:2.7.5}} as a dependency, which is vulnerable to 
> an information disclosure issue due to how the contents of sensitive headers, 
> such as the {{Authorization}} header, can be logged when an 
> {{IllegalArgumentException}} is thrown.
> This issue could allow an attacker or malicious user who has access to the 
> logs to obtain the sensitive contents of the affected headers which could 
> facilitate further attacks.
> Fixed in {{5.0.0-alpha3}} by 
> [this|https://github.com/square/okhttp/commit/dcc6483b7dc6d9c0b8e03ff7c30c13f3c75264a5]
>  commit. The fix was cherry-picked and backported into {{4.9.2}} with 
> [this|https://github.com/square/okhttp/commit/1fd7c0afdc2cee9ba982b07d49662af7f60e1518]
>  commit.
> Requesting clarification on whether this dependency will be updated to a 
> fixed version in the following releases.
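
For context, the leak pattern is generic: building an exception message from the raw header value puts the secret wherever the exception is logged. A minimal illustrative sketch (hypothetical code, not okhttp's actual implementation):

{code:java}
public final class HeaderValueCheckSketch {
  // Leaky pattern: the raw value (e.g. an Authorization token) ends up in
  // the IllegalArgumentException message, and from there in the logs.
  static void checkValueLeaky(String name, String value) {
    for (char c : value.toCharArray()) {
      if (c == '\n' || c == '\r' || c == '\0') {
        throw new IllegalArgumentException(
            "Unexpected char in header " + name + " value: " + value);
      }
    }
  }

  // Fixed pattern, mirroring the okhttp 4.9.2 backport: redact values of
  // sensitive headers instead of echoing them.
  static void checkValueRedacted(String name, String value) {
    boolean sensitive = "Authorization".equalsIgnoreCase(name)
        || "Cookie".equalsIgnoreCase(name);
    for (char c : value.toCharArray()) {
      if (c == '\n' || c == '\r' || c == '\0') {
        throw new IllegalArgumentException("Unexpected char in header "
            + name + " value" + (sensitive ? "" : ": " + value));
      }
    }
  }
}
{code}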



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16453) Upgrade okhttp from 2.7.5 to 4.9.3

2022-05-20 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16453:
-
Summary: Upgrade okhttp from 2.7.5 to 4.9.3  (was: okhttp vulnerable 
library update)

> Upgrade okhttp from 2.7.5 to 4.9.3
> --
>
> Key: HDFS-16453
> URL: https://issues.apache.org/jira/browse/HDFS-16453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1
>Reporter: Ivan Viaznikov
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{org.apache.hadoop:hadoop-hdfs-client}} comes with 
> {{com.squareup.okhttp:okhttp:2.7.5}} as a dependency, which is vulnerable to 
> an information disclosure issue due to how the contents of sensitive headers, 
> such as the {{Authorization}} header, can be logged when an 
> {{IllegalArgumentException}} is thrown.
> This issue could allow an attacker or malicious user who has access to the 
> logs to obtain the sensitive contents of the affected headers which could 
> facilitate further attacks.
> Fixed in {{5.0.0-alpha3}} by 
> [this|https://github.com/square/okhttp/commit/dcc6483b7dc6d9c0b8e03ff7c30c13f3c75264a5]
>  commit. The fix was cherry-picked and backported into {{4.9.2}} with 
> [this|https://github.com/square/okhttp/commit/1fd7c0afdc2cee9ba982b07d49662af7f60e1518]
>  commit.
> Requesting clarification on whether this dependency will be updated to a 
> fixed version in the following releases.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16552) Fix NPE for TestBlockManager

2022-05-13 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536771#comment-17536771
 ] 

Akira Ajisaka commented on HDFS-16552:
--

Hi [~tomscut] and [~tasanuma] - Would you fix the build failure in branch-3.2? 
Looks like this issue broke the build.
{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
(default-testCompile) on project hadoop-hdfs: Compilation failure: Compilation 
failure: 
[ERROR] 
/home/aajisaka/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java:[847,5]
 cannot find symbol
[ERROR]   symbol:   variable NameNode
[ERROR]   location: class 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager
[ERROR] 
/home/aajisaka/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java:[893,5]
 cannot find symbol
[ERROR]   symbol:   variable NameNode
[ERROR]   location: class 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager
[ERROR] -> [Help 1] {code}
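
For reference, a minimal sketch of the metrics initialization the fix relies on; the compile failure above is consistent with a missing import of NameNode in the branch-3.2 backport (the exact fix may differ):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.server.common.HdfsServerConstants.NamenodeRole;
// Without this import, the initMetrics call below fails to compile with
// "cannot find symbol: variable NameNode", as in the log above.
import org.apache.hadoop.hdfs.server.namenode.NameNode;

public class BlockManagerTestSetupSketch {
  public static void initTestMetrics() {
    // Initializing NameNodeMetrics up front avoids the NPE in
    // BlockManager#scheduleReconstruction during the test.
    NameNode.initMetrics(new Configuration(), NamenodeRole.NAMENODE);
  }
}
{code}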

> Fix NPE for TestBlockManager
> 
>
> Key: HDFS-16552
> URL: https://issues.apache.org/jira/browse/HDFS-16552
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> There is an NPE in BlockManager when running 
> TestBlockManager#testSkipReconstructionWithManyBusyNodes2, because 
> NameNodeMetrics is not initialized in this unit test.
>  
> Related ci link, see 
> [this|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4209/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt].
> {code:java}
> [ERROR] Tests run: 34, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 30.088 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager
> [ERROR] 
> testSkipReconstructionWithManyBusyNodes2(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager)
>   Time elapsed: 2.783 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.scheduleReconstruction(BlockManager.java:2171)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager.testSkipReconstructionWithManyBusyNodes2(TestBlockManager.java:947)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> 

[jira] [Resolved] (HDFS-16185) Fix comment in LowRedundancyBlocks.java

2022-05-07 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16185.
--
Fix Version/s: 3.4.0
   3.2.4
   3.3.4
   Resolution: Fixed

Committed to trunk, branch-3.3, and branch-3.2. Thank you [~groot] for your 
contribution.

> Fix comment in LowRedundancyBlocks.java
> ---
>
> Key: HDFS-16185
> URL: https://issues.apache.org/jira/browse/HDFS-16185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: newbie, pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/hadoop/blob/c8e58648389c7b0b476c3d0d47be86af2966842f/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java#L249]
> "can only afford one replica loss" is not correct there. Before HDFS-9857, 
> the comment is "there is less than a third as many blocks as requested; this 
> is considered very under-replicated" and it seems correct.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16255) RBF: Fix dead link to fedbalance document

2022-04-24 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16255.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Committed to trunk. Thank you [~groot] for your contribution.

> RBF: Fix dead link to fedbalance document
> -
>
> Key: HDFS-16255
> URL: https://issues.apache.org/jira/browse/HDFS-16255
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: newbie, pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There is a dead link in HDFSRouterFederation.md 
> (https://github.com/apache/hadoop/blob/e90c41af34ada9d7b61e4d5a8b88c2f62c7fea25/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md?plain=1#L517)
> {{../../../hadoop-federation-balance/HDFSFederationBalance.md}} should be 
> {{../../hadoop-federation-balance/HDFSFederationBalance.md}}.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13983) TestOfflineImageViewer crashes in windows

2022-04-22 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-13983:
-
Fix Version/s: 3.2.4

Cherry-picked to branch-3.2.

> TestOfflineImageViewer crashes in windows
> -
>
> Key: HDFS-13983
> URL: https://issues.apache.org/jira/browse/HDFS-13983
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
> Fix For: 3.3.0, 3.2.4
>
> Attachments: HDFS-13893-with-patch-intellij-idea.JPG, 
> HDFS-13893-with-patch-mvn.JPG, 
> HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG, 
> HDFS-13893-without-patch-intellij-idea.JPG, HDFS-13893-without-patch-mvn.JPG, 
> HDFS-13983-01.patch, HDFS-13983-02.patch, HDFS-13983-03.patch
>
>
> TestOfflineImageViewer crashes in windows because, OfflineImageViewer 
> REVERSEXML tries to delete the outputfile and re-create the same stream which 
> is already created.
> Also there are unclosed RAF for input files which blocks from files being 
> deleted.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16546) Fix UT TestOfflineImageViewer#testReverseXmlWithoutSnapshotDiffSection to branch branch-3.2

2022-04-22 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16546:
-
Affects Version/s: 3.2.3
   (was: 3.2.0)

> Fix UT TestOfflineImageViewer#testReverseXmlWithoutSnapshotDiffSection to 
> branch branch-3.2
> ---
>
> Key: HDFS-16546
> URL: https://issues.apache.org/jira/browse/HDFS-16546
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.2.3
>Reporter: daimin
>Assignee: daimin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.4
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The test fails due to incorrect layoutVersion.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16546) Fix UT TestOfflineImageViewer#testReverseXmlWithoutSnapshotDiffSection to branch branch-3.2

2022-04-22 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16546.
--
Fix Version/s: 3.2.4
   Resolution: Fixed

Committed to branch-3.2. Thank you [~cndaimin] for your contribution!

> Fix UT TestOfflineImageViewer#testReverseXmlWithoutSnapshotDiffSection to 
> branch branch-3.2
> ---
>
> Key: HDFS-16546
> URL: https://issues.apache.org/jira/browse/HDFS-16546
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.2.0
>Reporter: daimin
>Assignee: daimin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.4
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The test fails due to incorrect layoutVersion.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16546) Fix UT TestOfflineImageViewer#testReverseXmlWithoutSnapshotDiffSection to branch branch-3.2

2022-04-22 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16546:
-
Priority: Major  (was: Minor)

> Fix UT TestOfflineImageViewer#testReverseXmlWithoutSnapshotDiffSection to 
> branch branch-3.2
> ---
>
> Key: HDFS-16546
> URL: https://issues.apache.org/jira/browse/HDFS-16546
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.2.0
>Reporter: daimin
>Assignee: daimin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The test fails due to incorrect layoutVersion.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16546) Fix UT TestOfflineImageViewer#testReverseXmlWithoutSnapshotDiffSection to branch branch-3.2

2022-04-22 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16546:
-
Issue Type: Bug  (was: Test)

> Fix UT TestOfflineImageViewer#testReverseXmlWithoutSnapshotDiffSection to 
> branch branch-3.2
> ---
>
> Key: HDFS-16546
> URL: https://issues.apache.org/jira/browse/HDFS-16546
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.2.0
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The test fails due to incorrect layoutVersion.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16035) Remove DummyGroupMapping as it is no longer used anywhere

2022-04-18 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16035.
--
Resolution: Fixed

Thank you [~vjasani] for your report and thank you [~groot] for your 
contribution.

> Remove DummyGroupMapping as it is no longer used anywhere
> --
>
> Key: HDFS-16035
> URL: https://issues.apache.org/jira/browse/HDFS-16035
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: httpfs, test
>Reporter: Viraj Jasani
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: beginner, newbie, pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> DummyGroupMapping class was added as part of HDFS-2657 and it was only used 
> in TestHttpFSServer as httpfs.groups.hadoop.security.group.mapping. However, 
> TestHttpFSServer is no longer using DummyGroupMapping and hence, it can be 
> removed completely as it is not used anywhere.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16035) Remove DummyGroupMapping as it is no longer used anywhere

2022-04-18 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16035:
-
Fix Version/s: 3.4.0
  Summary: Remove DummyGroupMapping as it is no longer used anywhere  
(was: Remove DummyGroupMapping)

Committed to trunk.

> Remove DummyGroupMapping as it is no longer used anywhere
> --
>
> Key: HDFS-16035
> URL: https://issues.apache.org/jira/browse/HDFS-16035
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: httpfs, test
>Reporter: Viraj Jasani
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: beginner, newbie, pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> DummyGroupMapping class was added as part of HDFS-2657 and it was only used 
> in TestHttpFSServer as httpfs.groups.hadoop.security.group.mapping. However, 
> TestHttpFSServer is no longer using DummyGroupMapping and hence, it can be 
> removed completely as it is not used anywhere.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16035) Remove DummyGroupMapping

2022-04-18 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16035:
-
Component/s: httpfs
 test

> Remove DummyGroupMapping
> 
>
> Key: HDFS-16035
> URL: https://issues.apache.org/jira/browse/HDFS-16035
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: httpfs, test
>Reporter: Viraj Jasani
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: beginner, newbie, pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> DummyGroupMapping class was added as part of HDFS-2657 and it was only used 
> in TestHttpFSServer as httpfs.groups.hadoop.security.group.mapping. However, 
> TestHttpFSServer is no longer using DummyGroupMapping and hence, it can be 
> removed completely as it is not used anywhere.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16536) TestOfflineImageViewer fails on branch-3.3

2022-04-18 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16536.
--
Fix Version/s: 3.3.4
   Resolution: Fixed

Committed to branch-3.3. Thank you [~groot] 

> TestOfflineImageViewer fails on branch-3.3
> --
>
> Key: HDFS-16536
> URL: https://issues.apache.org/jira/browse/HDFS-16536
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 3.3.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The NameNodeLayoutVersion -67 is not supported in Hadoop 3.3.x, so we need to 
> downgrade the version in the XML to -66.
> {code:java}
> [INFO] Running 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
> [ERROR] Tests run: 27, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 7.918 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
> [ERROR] 
> testReverseXmlWithoutSnapshotDiffSection(org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer)
>   Time elapsed: 0.009 s  <<< ERROR!
> java.io.IOException: Layout version mismatch.  This oiv tool handles layout 
> version -66, but the XML file has  -67.  Please either 
> re-generate the XML file with the proper layout version, or manually edit the 
> XML file to be usable with this version of the oiv tool.
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.readVersion(OfflineImageReconstructor.java:1699)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.processXml(OfflineImageReconstructor.java:1753)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.run(OfflineImageReconstructor.java:1846)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testReverseXmlWithoutSnapshotDiffSection(TestOfflineImageViewer.java:1209)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {code}
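
A minimal sketch of the XML tweak the description calls for, assuming the test fsimage XML stores the version in a <layoutVersion> element (as the oiv ReverseXML format does):

{code:java}
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DowngradeLayoutVersionSketch {
  public static void main(String[] args) throws Exception {
    // Rewrite -67 to -66 so the branch-3.3 oiv tool accepts the image XML.
    Path xml = Paths.get(args[0]);
    String content = new String(Files.readAllBytes(xml), StandardCharsets.UTF_8);
    Files.write(xml, content
        .replace("<layoutVersion>-67</layoutVersion>",
                 "<layoutVersion>-66</layoutVersion>")
        .getBytes(StandardCharsets.UTF_8));
  }
}
{code}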



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HDFS-16536) TestOfflineImageViewer fails on branch-3.3

2022-04-18 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16536:
-
Target Version/s: 3.3.4  (was: 3.2.4, 3.3.4)

> TestOfflineImageViewer fails on branch-3.3
> --
>
> Key: HDFS-16536
> URL: https://issues.apache.org/jira/browse/HDFS-16536
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: newbie, pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The NameNodeLayoutVersion -67 is not supported in Hadoop 3.3.x, so we need to 
> downgrade the version in the XML to -66.
> {code:java}
> [INFO] Running 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
> [ERROR] Tests run: 27, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 7.918 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
> [ERROR] 
> testReverseXmlWithoutSnapshotDiffSection(org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer)
>   Time elapsed: 0.009 s  <<< ERROR!
> java.io.IOException: Layout version mismatch.  This oiv tool handles layout 
> version -66, but the XML file has  -67.  Please either 
> re-generate the XML file with the proper layout version, or manually edit the 
> XML file to be usable with this version of the oiv tool.
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.readVersion(OfflineImageReconstructor.java:1699)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.processXml(OfflineImageReconstructor.java:1753)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.run(OfflineImageReconstructor.java:1846)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testReverseXmlWithoutSnapshotDiffSection(TestOfflineImageViewer.java:1209)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: 

[jira] [Assigned] (HDFS-16255) RBF: Fix dead link to fedbalance document

2022-04-17 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16255:


Assignee: Ashutosh Gupta

> RBF: Fix dead link to fedbalance document
> -
>
> Key: HDFS-16255
> URL: https://issues.apache.org/jira/browse/HDFS-16255
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: newbie
>
> There is a dead link in HDFSRouterFederation.md 
> (https://github.com/apache/hadoop/blob/e90c41af34ada9d7b61e4d5a8b88c2f62c7fea25/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md?plain=1#L517)
> {{../../../hadoop-federation-balance/HDFSFederationBalance.md}} should be 
> {{../../hadoop-federation-balance/HDFSFederationBalance.md}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16255) RBF: Fix dead link to fedbalance document

2022-04-17 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523518#comment-17523518
 ] 

Akira Ajisaka commented on HDFS-16255:
--

[~groot] - Yes, you can.

> RBF: Fix dead link to fedbalance document
> -
>
> Key: HDFS-16255
> URL: https://issues.apache.org/jira/browse/HDFS-16255
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: newbie
>
> There is a dead link in HDFSRouterFederation.md 
> (https://github.com/apache/hadoop/blob/e90c41af34ada9d7b61e4d5a8b88c2f62c7fea25/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md?plain=1#L517)
> {{../../../hadoop-federation-balance/HDFSFederationBalance.md}} should be 
> {{../../hadoop-federation-balance/HDFSFederationBalance.md}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16536) TestOfflineImageViewer fails on branch-3.3

2022-04-10 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16536:
-
Target Version/s: 3.2.4, 3.3.3  (was: 3.3.3)

> TestOfflineImageViewer fails on branch-3.3
> --
>
> Key: HDFS-16536
> URL: https://issues.apache.org/jira/browse/HDFS-16536
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Priority: Major
>  Labels: newbie
>
> The NameNodeLayoutVersion -67 is not supported in Hadoop 3.3.x, so we need to 
> downgrade the version in the XML to -66.
> {code:java}
> [INFO] Running 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
> [ERROR] Tests run: 27, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 7.918 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
> [ERROR] 
> testReverseXmlWithoutSnapshotDiffSection(org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer)
>   Time elapsed: 0.009 s  <<< ERROR!
> java.io.IOException: Layout version mismatch.  This oiv tool handles layout 
> version -66, but the XML file has  -67.  Please either 
> re-generate the XML file with the proper layout version, or manually edit the 
> XML file to be usable with this version of the oiv tool.
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.readVersion(OfflineImageReconstructor.java:1699)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.processXml(OfflineImageReconstructor.java:1753)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.run(OfflineImageReconstructor.java:1846)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testReverseXmlWithoutSnapshotDiffSection(TestOfflineImageViewer.java:1209)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16536) TestOfflineImageViewer fails on branch-3.3

2022-04-10 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-16536:


 Summary: TestOfflineImageViewer fails on branch-3.3
 Key: HDFS-16536
 URL: https://issues.apache.org/jira/browse/HDFS-16536
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Akira Ajisaka


The NameNodeLayoutVersion -67 is not supported in Hadoop 3.3.x, so we need to 
downgrade the version in the XML to -66.
{code:java}
[INFO] Running 
org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
[ERROR] Tests run: 27, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 7.918 
s <<< FAILURE! - in 
org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
[ERROR] 
testReverseXmlWithoutSnapshotDiffSection(org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer)
  Time elapsed: 0.009 s  <<< ERROR!
java.io.IOException: Layout version mismatch.  This oiv tool handles layout 
version -66, but the XML file has  -67.  Please either 
re-generate the XML file with the proper layout version, or manually edit the 
XML file to be usable with this version of the oiv tool.
at 
org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.readVersion(OfflineImageReconstructor.java:1699)
at 
org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.processXml(OfflineImageReconstructor.java:1753)
at 
org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.run(OfflineImageReconstructor.java:1846)
at 
org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testReverseXmlWithoutSnapshotDiffSection(TestOfflineImageViewer.java:1209)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16522) Set Http and Ipc ports for Datanodes in MiniDFSCluster

2022-04-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16522:
-
Fix Version/s: 3.4.0
   3.3.3
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-3.3.

> Set Http and Ipc ports for Datanodes in MiniDFSCluster
> --
>
> Key: HDFS-16522
> URL: https://issues.apache.org/jira/browse/HDFS-16522
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We should provide options to set Http and Ipc ports for Datanodes in 
> MiniDFSCluster.
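
For context, MiniDFSCluster.Builder already lets callers pin the NameNode's RPC and HTTP ports; this issue asks for the DataNode-side equivalent. A minimal sketch of the existing NameNode-side usage (the DataNode-port setters this patch adds are not shown, since their exact names are not in this thread):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class MiniClusterPortsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // The NameNode ports can already be pinned on the builder; the request
    // here is analogous control over each DataNode's HTTP and IPC ports.
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(1)
        .nameNodePort(8020)
        .nameNodeHttpPort(9870)
        .build();
    try {
      cluster.waitActive();
    } finally {
      cluster.shutdown();
    }
  }
}
{code}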



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16481) Provide support to set Http and Rpc ports in MiniJournalCluster

2022-04-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16481:
-
Fix Version/s: 3.3.3

Backported to branch-3.3 because HDFS-16522 requires this.

> Provide support to set Http and Rpc ports in MiniJournalCluster
> ---
>
> Key: HDFS-16481
> URL: https://issues.apache.org/jira/browse/HDFS-16481
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> We should provide support for clients to set Http and Rpc ports of 
> JournalNodes in MiniJournalCluster.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16529) Remove unnecessary setObserverRead in TestConsistentReadsObserver

2022-04-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16529:
-
Component/s: (was: s)

> Remove unnecessary setObserverRead in TestConsistentReadsObserver
> -
>
> Key: HDFS-16529
> URL: https://issues.apache.org/jira/browse/HDFS-16529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: wangzhaohui
>Assignee: wangzhaohui
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.2.4, 3.3.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16529) Remove unnecessary setObserverRead in TestConsistentReadsObserver

2022-04-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16529:
-
Component/s: s
 test

> Remove unnecessary setObserverRead in TestConsistentReadsObserver
> -
>
> Key: HDFS-16529
> URL: https://issues.apache.org/jira/browse/HDFS-16529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: s, test
>Reporter: wangzhaohui
>Assignee: wangzhaohui
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.2.4, 3.3.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16529) Remove unnecessary setObserverRead in TestConsistentReadsObserver

2022-04-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16529:


Assignee: wangzhaohui

> Remove unnecessary setObserverRead in TestConsistentReadsObserver
> -
>
> Key: HDFS-16529
> URL: https://issues.apache.org/jira/browse/HDFS-16529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: wangzhaohui
>Assignee: wangzhaohui
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.2.4, 3.3.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16529) Remove unnecessary setObserverRead in TestConsistentReadsObserver

2022-04-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16529.
--
Fix Version/s: 3.4.0
   2.10.2
   3.2.4
   3.3.3
   Resolution: Fixed

Committed to trunk, branch-3.3, branch-3.2, and branch-2.10. Thank you 
[~wangzhaohui] for your contribution!

> Remove unnecessary setObserverRead in TestConsistentReadsObserver
> -
>
> Key: HDFS-16529
> URL: https://issues.apache.org/jira/browse/HDFS-16529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: wangzhaohui
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.2.4, 3.3.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16527) Add global timeout rule for TestRouterDistCpProcedure

2022-04-05 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16527.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Committed to trunk. Thank you [~tomscut] for your contribution.

> Add global timeout rule for TestRouterDistCpProcedure
> -
>
> Key: HDFS-16527
> URL: https://issues.apache.org/jira/browse/HDFS-16527
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> As [Ayush Saxena|https://github.com/ayushtkn] mentioned 
> [here|https://github.com/apache/hadoop/pull/4009#pullrequestreview-925554297], 
> TestRouterDistCpProcedure failed many times because of a timeout. I will add a 
> global timeout rule for it, which makes the timeout easy to set.
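
A minimal sketch of such a JUnit 4 global timeout rule (the 180-second value is illustrative):

{code:java}
import org.junit.Rule;
import org.junit.rules.Timeout;

public class TestRouterDistCpProcedureTimeoutSketch {
  // Applies to every test method in the class, so the timeout lives in one
  // place instead of per-method @Test(timeout=...) annotations.
  @Rule
  public Timeout globalTimeout = Timeout.seconds(180);
}
{code}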



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16518) KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager

2022-03-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16518:
-
Fix Version/s: (was: 2.10.0)

> KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager
> -
>
> Key: HDFS-16518
> URL: https://issues.apache.org/jira/browse/HDFS-16518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0
>Reporter: Lei Yang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> KeyProvider implements the Closeable interface, but some custom 
> implementations of KeyProvider also need an explicit close in 
> KeyProviderCache. An example is using a custom KeyProvider in DFSClient to 
> read encrypted files on HDFS. A KeyProvider currently gets closed in 
> KeyProviderCache only when its cache entry is expired or invalidated; in 
> some cases this does not happen, which seems related to the Guava cache.
> This patch uses the Hadoop JVM ShutdownHookManager to globally clean up 
> cache entries, and thus close each KeyProvider via the cache removal hook, 
> deterministically, right after the filesystem instance gets closed.
> {code:java}
> class KeyProviderCache
> ...
>   public KeyProviderCache(long expiryMs) {
>     cache = CacheBuilder.newBuilder()
>         .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS)
>         .removalListener(new RemovalListener<URI, KeyProvider>() {
>           @Override
>           public void onRemoval(
>               @Nonnull RemovalNotification<URI, KeyProvider> notification) {
>             try {
>               assert notification.getValue() != null;
>               notification.getValue().close();
>             } catch (Throwable e) {
>               LOG.error("Error closing KeyProvider with uri ["
>                   + notification.getKey() + "]", e);
>             }
>           }
>         })
>         .build();
>   }{code}
> We could have added a new function KeyProviderCache#close and had each 
> DFSClient call it at the end of DFSClient#close, but that would expose 
> another problem: it could close a cache that is global across different 
> DFSClient instances.
>  
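
For reference, a minimal sketch of the shutdown-hook wiring described above, using Hadoop's org.apache.hadoop.util.ShutdownHookManager and plain Guava (the real patch may use the shaded thirdparty Guava and a different hook priority):

{code:java}
import com.google.common.cache.Cache;
import org.apache.hadoop.util.ShutdownHookManager;

public final class KeyProviderCacheShutdownSketch {
  // Illustrative priority; the actual patch may choose a different value.
  private static final int SHUTDOWN_HOOK_PRIORITY = 1;

  public static <K, V> void registerCleanup(final Cache<K, V> cache) {
    // On JVM shutdown, invalidate every live entry; the cache's removal
    // listener (shown above) then closes each cached KeyProvider.
    ShutdownHookManager.get().addShutdownHook(new Runnable() {
      @Override
      public void run() {
        cache.invalidateAll();
        cache.cleanUp();
      }
    }, SHUTDOWN_HOOK_PRIORITY);
  }
}
{code}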



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16355) Improve the description of dfs.block.scanner.volume.bytes.per.second

2022-03-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16355.
--
Fix Version/s: 3.4.0
   3.2.4
   3.3.3
   Resolution: Fixed

Committed to trunk, branch-3.3, and branch-3.2. Thank you [~philipse] for your 
contribution!

> Improve the description of dfs.block.scanner.volume.bytes.per.second
> 
>
> Key: HDFS-16355
> URL: https://issues.apache.org/jira/browse/HDFS-16355
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, hdfs
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Assignee: guophilipse
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The datanode block scanner will be disabled if 
> `dfs.block.scanner.volume.bytes.per.second` is configured less than or equal 
> to zero; we can improve the description.
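
For illustration, the gating behaviour the description refers to amounts to a check like the following (key name from this thread; the 1 MiB default is an assumption based on hdfs-default.xml):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class BlockScannerConfigSketch {
  public static boolean scannerEnabled(Configuration conf) {
    // A value <= 0 disables the volume scanner entirely, which is the
    // behaviour the improved description should spell out.
    long bytesPerSec = conf.getLong(
        "dfs.block.scanner.volume.bytes.per.second", 1048576L);
    return bytesPerSec > 0;
  }
}
{code}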



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16355) Improve the description of dfs.block.scanner.volume.bytes.per.second

2022-03-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16355:
-
Component/s: documentation

> Improve the description of dfs.block.scanner.volume.bytes.per.second
> 
>
> Key: HDFS-16355
> URL: https://issues.apache.org/jira/browse/HDFS-16355
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, hdfs
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Assignee: guophilipse
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The datanode block scanner will be disabled if 
> `dfs.block.scanner.volume.bytes.per.second` is configured less than or equal 
> to zero; we can improve the description.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16355) Improve the description of dfs.block.scanner.volume.bytes.per.second

2022-03-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16355:
-
Summary: Improve the description of 
dfs.block.scanner.volume.bytes.per.second  (was: Improve block scanner desc)

> Improve the description of dfs.block.scanner.volume.bytes.per.second
> 
>
> Key: HDFS-16355
> URL: https://issues.apache.org/jira/browse/HDFS-16355
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Assignee: guophilipse
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The datanode block scanner will be disabled if 
> `dfs.block.scanner.volume.bytes.per.second` is configured less than or equal 
> to zero; we can improve the description.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16355) Improve block scanner desc

2022-03-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16355:


Assignee: guophilipse

> Improve block scanner desc
> --
>
> Key: HDFS-16355
> URL: https://issues.apache.org/jira/browse/HDFS-16355
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Assignee: guophilipse
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The datanode block scanner will be disabled if 
> `dfs.block.scanner.volume.bytes.per.second` is configured less than or equal 
> to zero; we can improve the description.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16523) Fix dependency error in hadoop-hdfs on M1 Mac

2022-03-26 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16523:
-
Summary: Fix dependency error in hadoop-hdfs on M1 Mac  (was: Fix 
dependency error in hadoop-hdfs)

> Fix dependency error in hadoop-hdfs on M1 Mac
> -
>
> Key: HDFS-16523
> URL: https://issues.apache.org/jira/browse/HDFS-16523
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
> Environment: M1 Pro Mac
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>
> The hadoop-hdfs build is failing on Docker with an M1 Mac.
> {code}
> [WARNING]
> Dependency convergence error for
> org.fusesource.hawtjni:hawtjni-runtime:jar:1.11:provided paths to
> dependency are:
> +-org.apache.hadoop:hadoop-hdfs:jar:3.4.0-SNAPSHOT
>   +-org.openlabtesting.leveldbjni:leveldbjni-all:jar:1.8:compile
> +-org.openlabtesting.leveldbjni:leveldbjni:jar:1.8:provided
>   +-org.fusesource.hawtjni:hawtjni-runtime:jar:1.11:provided
> and
> +-org.apache.hadoop:hadoop-hdfs:jar:3.4.0-SNAPSHOT
>   +-org.openlabtesting.leveldbjni:leveldbjni-all:jar:1.8:compile
> +-org.fusesource.leveldbjni:leveldbjni-osx:jar:1.8:provided
>   +-org.fusesource.leveldbjni:leveldbjni:jar:1.8:provided
> +-org.fusesource.hawtjni:hawtjni-runtime:jar:1.9:provided
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16523) Fix dependency error in hadoop-hdfs

2022-03-26 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-16523:


 Summary: Fix dependency error in hadoop-hdfs
 Key: HDFS-16523
 URL: https://issues.apache.org/jira/browse/HDFS-16523
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
 Environment: M1 Pro Mac
Reporter: Akira Ajisaka
Assignee: Akira Ajisaka


The hadoop-hdfs build is failing on Docker with an M1 Mac.
{code}
[WARNING]
Dependency convergence error for
org.fusesource.hawtjni:hawtjni-runtime:jar:1.11:provided paths to
dependency are:
+-org.apache.hadoop:hadoop-hdfs:jar:3.4.0-SNAPSHOT
  +-org.openlabtesting.leveldbjni:leveldbjni-all:jar:1.8:compile
+-org.openlabtesting.leveldbjni:leveldbjni:jar:1.8:provided
  +-org.fusesource.hawtjni:hawtjni-runtime:jar:1.11:provided
and
+-org.apache.hadoop:hadoop-hdfs:jar:3.4.0-SNAPSHOT
  +-org.openlabtesting.leveldbjni:leveldbjni-all:jar:1.8:compile
+-org.fusesource.leveldbjni:leveldbjni-osx:jar:1.8:provided
  +-org.fusesource.leveldbjni:leveldbjni:jar:1.8:provided
+-org.fusesource.hawtjni:hawtjni-runtime:jar:1.9:provided
{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2022-02-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-14099:
-
Fix Version/s: 3.2.3
   (was: 3.2.4)

Cherry-picked to branch-3.2.3.

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2
>
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing.
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> And I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> as below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I had to look at the code, including the JNI part, and then I found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice in the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call is correct, but *finished* is never set back to false, even if 
> there is still some data (a new frame) in *CompressedBuffer* or *UserBuffer* 
> that needs to be decompressed.
> The second call is triggered from 
> *org.apache.hadoop.io.compress.DecompressorStream* via *decompressor.reset()*, 
> because *finished* is always true after a *Frame* has been decompressed.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover buffered input from previous
>   // stream; if not, attempt to refill buffer.  If refill -> EOF, we're
>   // all done; else reset, fix up input buffer, and get ready for next
>   // concatenated substream/"member".
>   int nRemaining = decompressor.getRemaining();
>   if (nRemaining == 0) {
> int m = getCompressedData();
> if (m == -1) {
>   // apparently the previous end-of-stream was also end-of-file:
>   // return success, as if we had never called getCompressedData()
>   eof = true;
>   return -1;
> }
> decompressor.reset();
> 
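
For context, the multi-frame path this report exercises can be driven through 
the public codec API. A minimal sketch using the real ZStandardCodec class (the 
command-line harness and file argument are assumptions for the example):

{code:java}
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.ZStandardCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class MultiFrameZstdRead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    CompressionCodec codec =
        ReflectionUtils.newInstance(ZStandardCodec.class, conf);
    // DecompressorStream must reset the native decompressor between frames;
    // the bug described above surfaces exactly at that frame boundary.
    try (InputStream in =
        codec.createInputStream(new FileInputStream(args[0]))) {
      IOUtils.copyBytes(in, System.out, 4096, false);
    }
  }
}
{code}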

[jira] [Commented] (HDFS-14626) Decommission all nodes hosting last block of open file succeeds unexpectedly

2022-02-22 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17495944#comment-17495944
 ] 

Akira Ajisaka commented on HDFS-14626:
--

Hi [~sodonnell], how is this issue going? I think it is a bug, but I don't 
think we can fix it easily.

Currently DatanodeAdminManager does not track the number of UC blocks for each 
DN, so we would need to add a mechanism that returns the number of UC blocks 
per DN in O(1). Maybe we can get an approximate value from 
DatanodeDescriptor#getBlocksScheduled(), but I'm not sure that value can be 
used for decommissioning.
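
A minimal sketch of the approximation mentioned above: treat a non-zero 
scheduled-block count as "this DN may still have in-flight work". Whether that 
signal is safe for decommissioning is exactly the open question; the helper 
below is hypothetical, not existing Hadoop code.

{code:java}
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor;

public final class UcBlockHeuristicSketch {
  private UcBlockHeuristicSketch() {}

  // O(1) approximate check based on DatanodeDescriptor#getBlocksScheduled().
  public static boolean maybeHasPendingWork(DatanodeDescriptor dn) {
    return dn.getBlocksScheduled() > 0;
  }
}
{code}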

> Decommission all nodes hosting last block of open file succeeds unexpectedly 
> -
>
> Key: HDFS-14626
> URL: https://issues.apache.org/jira/browse/HDFS-14626
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: test-to-reproduce.patch
>
>
> I have been investigating scenarios that cause decommission to hang, 
> especially around one long standing issue. That is, an open block on the host 
> which is being decommissioned can cause the process to never complete.
> Checking the history, there seems to have been at least one change in 
> HDFS-5579 which greatly improved the situation, but from reading comments and 
> support cases, there still seems to be some scenarios where open blocks on a 
> DN host cause the decommission to get stuck.
> No matter what I try, I have not been able to reproduce this, but I think I 
> have uncovered another issue that may partly explain why.
> If I do the following, the nodes will decommission without any issues:
> 1. Create a file and write to it so it crosses a block boundary. Then there 
> is one complete block and one under construction block. Keep the file open, 
> and write a few bytes periodically.
> 2. Now note the nodes which the UC block is currently being written on, and 
> decommission them all.
> 3. The decommission should succeed.
> 4. Now attempt to close the open file, and it will fail to close with an 
> error like below, probably as decommissioned nodes are not allowed to send 
> IBRs:
> {code:java}
> java.io.IOException: Unable to close file because the last block 
> BP-646926902-192.168.0.20-1562099323291:blk_1073741827_1003 does not have 
> enough number of replicas.
>     at 
> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:968)
>     at 
> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:911)
>     at 
> org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:894)
>     at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:849)
>     at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>     at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101){code}
> Interestingly, if you recommission the nodes without restarting them before 
> closing the file, it will close OK, and writes to it can continue even once 
> decommission has completed.
> I don't think this is expected - i.e., decommission should not complete on 
> all nodes hosting the last UC block of a file, should it?
> From what I have figured out, I don't think UC blocks are considered in the 
> DatanodeAdminManager at all. This is because the original list of blocks it 
> cares about, are taken from the Datanode block Iterator, which takes them 
> from the DatanodeStorageInfo objects attached to the datanode instance. I 
> believe UC blocks don't make it into the DatanodeStorageInfo until after 
> they have been completed and an IBR sent, so the decommission logic never 
> considers them.
> What troubles me about this explanation, is how did open files previously 
> cause decommission to get stuck if it never checks for them, so I suspect I 
> am missing something.
> I will attach a patch with a test case that demonstrates this issue. This 
> reproduces on trunk and I also tested on CDH 5.8.1, which is based on the 2.6 
> branch, but with a lot of backports.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16449) Fix hadoop web site release notes and changelog not available

2022-02-13 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16449:
-
Fix Version/s: 3.2.4

Backported to branch-3.2.

> Fix hadoop web site release notes and changelog not available
> -
>
> Key: HDFS-16449
> URL: https://issues.apache.org/jira/browse/HDFS-16449
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Assignee: guophilipse
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Fix hadoop web site release notes and changelog not available



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16449) Fix hadoop web site release notes and changelog not available

2022-02-13 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16449:
-
Fix Version/s: 3.3.3

Backported to branch-3.3.

> Fix hadoop web site release notes and changelog not available
> -
>
> Key: HDFS-16449
> URL: https://issues.apache.org/jira/browse/HDFS-16449
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Assignee: guophilipse
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Fix hadoop web site release notes and changelog not available



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16449) Fix hadoop web site release notes and changelog not available

2022-02-13 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16449:


Assignee: guophilipse

> Fix hadoop web site release notes and changelog not available
> -
>
> Key: HDFS-16449
> URL: https://issues.apache.org/jira/browse/HDFS-16449
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Assignee: guophilipse
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Fix hadoop web site release notes and changelog not available



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16449) Fix hadoop web site release notes and changelog not available

2022-02-13 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16449:
-
Issue Type: Bug  (was: Improvement)

> Fix hadoop web site release notes and changelog not available
> -
>
> Key: HDFS-16449
> URL: https://issues.apache.org/jira/browse/HDFS-16449
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Fix hadoop web site release notes and changelog not available



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16449) Fix hadoop web site release notes and changelog not available

2022-02-13 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16449.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Merged the PR into trunk.

> Fix hadoop web site release notes and changelog not available
> -
>
> Key: HDFS-16449
> URL: https://issues.apache.org/jira/browse/HDFS-16449
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.1
>Reporter: guophilipse
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Fix hadoop web site release notes and changelog not available



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16443) Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a DatanodeDescriptor on exception

2022-01-30 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16443.
--
Fix Version/s: 3.2.4
   Resolution: Fixed

Backported to branch-3.2.

> Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a 
> DatanodeDescriptor on exception
> -
>
> Key: HDFS-16443
> URL: https://issues.apache.org/jira/browse/HDFS-16443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As part of the fix merged in: https://issues.apache.org/jira/browse/HDFS-16303
> There was a rare edge case noticed in DatanodeAdminDefaultMonitor which 
> causes a DatanodeDescriptor to be added twice to the pendingNodes queue. 
>  * a [datanode is unhealthy so it gets added to 
> "unhealthyDns"](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L227)
>  * an exception is thrown which causes [this catch 
> block](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L271)
>  to execute
>  * the [datanode is added to 
> "pendingNodes"](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L276)
>  * under certain conditions the [datanode can be added again from 
> "unhealthyDns" to "pendingNodes" 
> here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L296)
> This Jira is to track the 1-line fix for this bug.
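
A self-contained sketch of the double-enqueue hazard and the kind of one-line 
guard described above (only the queue/map shape of the bug is reproduced here, 
not DatanodeAdminDefaultMonitor itself, and the guard shown is illustrative 
rather than the merged patch):

{code:java}
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class DoubleEnqueueSketch {
  public static void main(String[] args) {
    Queue<String> pendingNodes = new ArrayDeque<>();
    Map<String, String> unhealthyDns = new HashMap<>();

    String dn = "dn-1";
    unhealthyDns.put(dn, dn);
    pendingNodes.add(dn); // re-queued by the exception path

    // Without a guard, draining unhealthyDns enqueues dn a second time.
    for (String d : unhealthyDns.values()) {
      if (!pendingNodes.contains(d)) { // the guard prevents the duplicate
        pendingNodes.add(d);
      }
    }
    System.out.println(pendingNodes); // prints [dn-1], not [dn-1, dn-1]
  }
}
{code}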



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning

2022-01-30 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16303.
--
Resolution: Fixed

> Losing over 100 datanodes in state decommissioning results in full blockage 
> of all datanode decommissioning
> ---
>
> Key: HDFS-16303
> URL: https://issues.apache.org/jira/browse/HDFS-16303
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1, 3.3.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 17h 50m
>  Remaining Estimate: 0h
>
> h2. Impact
> HDFS datanode decommissioning does not make any forward progress. For 
> example, the user adds X datanodes to the "dfs.hosts.exclude" file and all X 
> of those datanodes remain in state decommissioning forever without making any 
> forward progress towards being decommissioned.
> h2. Root Cause
> The HDFS Namenode class "DatanodeAdminManager" is responsible for 
> decommissioning datanodes.
> As per this "hdfs-site" configuration:
> {quote}Config = dfs.namenode.decommission.max.concurrent.tracked.nodes 
>  Default Value = 100
> The maximum number of decommission-in-progress datanodes that will be 
> tracked at one time by the namenode. Tracking a decommission-in-progress 
> datanode consumes additional NN memory proportional to the number of blocks 
> on the datanode. Having a conservative limit reduces the potential impact of 
> decommissioning a large number of nodes at once. A value of 0 means no limit 
> will be enforced.
> {quote}
> The Namenode will only actively track up to 100 datanodes for decommissioning 
> at any given time, so as to avoid Namenode memory pressure.
> Looking into the "DatanodeAdminManager" code:
>  * a datanode is only removed from the "tracked.nodes" set when it finishes 
> decommissioning
>  * a datanode is only added to the "tracked.nodes" set if there are fewer 
> than 100 datanodes being tracked
> So in the event that there are more than 100 datanodes being decommissioned 
> at a given time, some of those datanodes will not be in the "tracked.nodes" 
> set until 1 or more datanodes in the "tracked.nodes" finish 
> decommissioning. This is generally not a problem because the datanodes in 
> "tracked.nodes" will eventually finish decommissioning, but there is an edge 
> case where this logic prevents the namenode from making any forward progress 
> towards decommissioning.
> If all 100 datanodes in the "tracked.nodes" are unable to finish 
> decommissioning, then other datanodes (which may be able to be 
> decommissioned) will never get added to "tracked.nodes" and therefore will 
> never get the opportunity to be decommissioned.
> This can occur due to the following issue:
> {quote}2021-10-21 12:39:24,048 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager 
> (DatanodeAdminMonitor-0): Node W.X.Y.Z:50010 is dead while in Decommission In 
> Progress. Cannot be safely decommissioned or be in maintenance since there is 
> risk of reduced data durability or data loss. Either restart the failed node 
> or force decommissioning or maintenance by removing, calling refreshNodes, 
> then re-adding to the excludes or host config files.
> {quote}
> If a Datanode is lost while decommissioning (for example if the underlying 
> hardware fails or is lost), then it will remain in state decommissioning 
> forever.
> If 100 or more Datanodes are lost while decommissioning over the Hadoop 
> cluster lifetime, then this is enough to completely fill up the 
> "tracked.nodes" set. With the entire "tracked.nodes" set filled with 
> datanodes that can never finish decommissioning, any datanodes added after 
> this point will never be able to be decommissioned because they will never be 
> added to the "tracked.nodes" set.
> In this scenario:
>  * the "tracked.nodes" set is filled with datanodes which are lost & cannot 
> be recovered (and can never finish decommissioning so they will never be 
> removed from the set)
>  * the actual live datanodes being decommissioned are enqueued waiting to 
> enter the "tracked.nodes" set (and are stuck waiting indefinitely)
> This means that no progress towards decommissioning the live datanodes will 
> be made unless the user takes the following action:
> {quote}Either restart the failed node or force decommissioning or maintenance 
> by removing, calling refreshNodes, then re-adding to the excludes or host 
> config files.
> {quote}
> Ideally, the Namenode should be able to gracefully handle scenarios where the 
> datanodes in the "tracked.nodes" set are not making forward progress towards 
> decommissioning while the enqueued datanodes 
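
For reference, the limit discussed above is an ordinary integer setting. A 
hedged sketch of reading it (the key name and the 0-means-unlimited semantics 
come from the quoted documentation; the surrounding harness is illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class TrackedNodesLimitSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    int maxTracked = conf.getInt(
        "dfs.namenode.decommission.max.concurrent.tracked.nodes", 100);
    // 0 disables the limit, which avoids the stuck-queue scenario described
    // above at the cost of more NameNode memory.
    System.out.println(maxTracked == 0
        ? "decommission tracking: unlimited"
        : "decommission tracking limited to " + maxTracked + " nodes");
  }
}
{code}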

[jira] [Updated] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning

2022-01-30 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16303:
-
Fix Version/s: 3.2.4

Merged [https://github.com/apache/hadoop/pull/3920] into branch-3.2. Thank you 
[~KevinWikant] for contributing the backport.

> Losing over 100 datanodes in state decommissioning results in full blockage 
> of all datanode decommissioning
> ---
>
> Key: HDFS-16303
> URL: https://issues.apache.org/jira/browse/HDFS-16303
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1, 3.3.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 17h 50m
>  Remaining Estimate: 0h
>
> h2. Impact
> HDFS datanode decommissioning does not make any forward progress. For 
> example, the user adds X datanodes to the "dfs.hosts.exclude" file and all X 
> of those datanodes remain in state decommissioning forever without making any 
> forward progress towards being decommissioned.
> h2. Root Cause
> The HDFS Namenode class "DatanodeAdminManager" is responsible for 
> decommissioning datanodes.
> As per this "hdfs-site" configuration:
> {quote}Config = dfs.namenode.decommission.max.concurrent.tracked.nodes 
>  Default Value = 100
> The maximum number of decommission-in-progress datanodes that will be 
> tracked at one time by the namenode. Tracking a decommission-in-progress 
> datanode consumes additional NN memory proportional to the number of blocks 
> on the datanode. Having a conservative limit reduces the potential impact of 
> decommissioning a large number of nodes at once. A value of 0 means no limit 
> will be enforced.
> {quote}
> The Namenode will only actively track up to 100 datanodes for decommissioning 
> at any given time, so as to avoid Namenode memory pressure.
> Looking into the "DatanodeAdminManager" code:
>  * a datanode is only removed from the "tracked.nodes" set when it finishes 
> decommissioning
>  * a datanode is only added to the "tracked.nodes" set if there are fewer 
> than 100 datanodes being tracked
> So in the event that there are more than 100 datanodes being decommissioned 
> at a given time, some of those datanodes will not be in the "tracked.nodes" 
> set until 1 or more datanodes in the "tracked.nodes" finish 
> decommissioning. This is generally not a problem because the datanodes in 
> "tracked.nodes" will eventually finish decommissioning, but there is an edge 
> case where this logic prevents the namenode from making any forward progress 
> towards decommissioning.
> If all 100 datanodes in the "tracked.nodes" are unable to finish 
> decommissioning, then other datanodes (which may be able to be 
> decommissioned) will never get added to "tracked.nodes" and therefore will 
> never get the opportunity to be decommissioned.
> This can occur due to the following issue:
> {quote}2021-10-21 12:39:24,048 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager 
> (DatanodeAdminMonitor-0): Node W.X.Y.Z:50010 is dead while in Decommission In 
> Progress. Cannot be safely decommissioned or be in maintenance since there is 
> risk of reduced data durability or data loss. Either restart the failed node 
> or force decommissioning or maintenance by removing, calling refreshNodes, 
> then re-adding to the excludes or host config files.
> {quote}
> If a Datanode is lost while decommissioning (for example if the underlying 
> hardware fails or is lost), then it will remain in state decommissioning 
> forever.
> If 100 or more Datanodes are lost while decommissioning over the Hadoop 
> cluster lifetime, then this is enough to completely fill up the 
> "tracked.nodes" set. With the entire "tracked.nodes" set filled with 
> datanodes that can never finish decommissioning, any datanodes added after 
> this point will never be able to be decommissioned because they will never be 
> added to the "tracked.nodes" set.
> In this scenario:
>  * the "tracked.nodes" set is filled with datanodes which are lost & cannot 
> be recovered (and can never finish decommissioning so they will never be 
> removed from the set)
>  * the actual live datanodes being decommissioned are enqueued waiting to 
> enter the "tracked.nodes" set (and are stuck waiting indefinitely)
> This means that no progress towards decommissioning the live datanodes will 
> be made unless the user takes the following action:
> {quote}Either restart the failed node or force decommissioning or maintenance 
> by removing, calling refreshNodes, then re-adding to the excludes or host 
> config files.
> {quote}
> Ideally, the Namenode should be able to gracefully handle scenarios 

[jira] [Updated] (HDFS-16443) Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a DatanodeDescriptor on exception

2022-01-30 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16443:
-
Fix Version/s: 3.3.3

Backported to branch-3.3.

> Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a 
> DatanodeDescriptor on exception
> -
>
> Key: HDFS-16443
> URL: https://issues.apache.org/jira/browse/HDFS-16443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As part of the fix merged in: https://issues.apache.org/jira/browse/HDFS-16303
> There was a rare edge case noticed in DatanodeAdminDefaultMonitor which 
> causes a DatanodeDescriptor to be added twice to the pendingNodes queue. 
>  * a [datanode is unhealthy so it gets added to 
> "unhealthyDns"](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L227)
>  * an exception is thrown which causes [this catch 
> block](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L271)
>  to execute
>  * the [datanode is added to 
> "pendingNodes"](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L276)
>  * under certain conditions the [datanode can be added again from 
> "unhealthyDns" to "pendingNodes" 
> here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L296)
> This Jira is to track the 1-line fix for this bug.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning

2022-01-30 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16303:
-
Fix Version/s: 3.3.3

Merged https://github.com/apache/hadoop/pull/3921 into branch-3.3.

> Losing over 100 datanodes in state decommissioning results in full blockage 
> of all datanode decommissioning
> ---
>
> Key: HDFS-16303
> URL: https://issues.apache.org/jira/browse/HDFS-16303
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1, 3.3.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 17h 40m
>  Remaining Estimate: 0h
>
> h2. Impact
> HDFS datanode decommissioning does not make any forward progress. For 
> example, the user adds X datanodes to the "dfs.hosts.exclude" file and all X 
> of those datanodes remain in state decommissioning forever without making any 
> forward progress towards being decommissioned.
> h2. Root Cause
> The HDFS Namenode class "DatanodeAdminManager" is responsible for 
> decommissioning datanodes.
> As per this "hdfs-site" configuration:
> {quote}Config = dfs.namenode.decommission.max.concurrent.tracked.nodes 
>  Default Value = 100
> The maximum number of decommission-in-progress datanodes that will be 
> tracked at one time by the namenode. Tracking a decommission-in-progress 
> datanode consumes additional NN memory proportional to the number of blocks 
> on the datanode. Having a conservative limit reduces the potential impact of 
> decommissioning a large number of nodes at once. A value of 0 means no limit 
> will be enforced.
> {quote}
> The Namenode will only actively track up to 100 datanodes for decommissioning 
> at any given time, so as to avoid Namenode memory pressure.
> Looking into the "DatanodeAdminManager" code:
>  * a datanode is only removed from the "tracked.nodes" set when it finishes 
> decommissioning
>  * a datanode is only added to the "tracked.nodes" set if there are fewer 
> than 100 datanodes being tracked
> So in the event that there are more than 100 datanodes being decommissioned 
> at a given time, some of those datanodes will not be in the "tracked.nodes" 
> set until 1 or more datanodes in the "tracked.nodes" finish 
> decommissioning. This is generally not a problem because the datanodes in 
> "tracked.nodes" will eventually finish decommissioning, but there is an edge 
> case where this logic prevents the namenode from making any forward progress 
> towards decommissioning.
> If all 100 datanodes in the "tracked.nodes" are unable to finish 
> decommissioning, then other datanodes (which may be able to be 
> decommissioned) will never get added to "tracked.nodes" and therefore will 
> never get the opportunity to be decommissioned.
> This can occur due to the following issue:
> {quote}2021-10-21 12:39:24,048 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager 
> (DatanodeAdminMonitor-0): Node W.X.Y.Z:50010 is dead while in Decommission In 
> Progress. Cannot be safely decommissioned or be in maintenance since there is 
> risk of reduced data durability or data loss. Either restart the failed node 
> or force decommissioning or maintenance by removing, calling refreshNodes, 
> then re-adding to the excludes or host config files.
> {quote}
> If a Datanode is lost while decommissioning (for example if the underlying 
> hardware fails or is lost), then it will remain in state decommissioning 
> forever.
> If 100 or more Datanodes are lost while decommissioning over the Hadoop 
> cluster lifetime, then this is enough to completely fill up the 
> "tracked.nodes" set. With the entire "tracked.nodes" set filled with 
> datanodes that can never finish decommissioning, any datanodes added after 
> this point will never be able to be decommissioned because they will never be 
> added to the "tracked.nodes" set.
> In this scenario:
>  * the "tracked.nodes" set is filled with datanodes which are lost & cannot 
> be recovered (and can never finish decommissioning so they will never be 
> removed from the set)
>  * the actual live datanodes being decommissioned are enqueued waiting to 
> enter the "tracked.nodes" set (and are stuck waiting indefinitely)
> This means that no progress towards decommissioning the live datanodes will 
> be made unless the user takes the following action:
> {quote}Either restart the failed node or force decommissioning or maintenance 
> by removing, calling refreshNodes, then re-adding to the excludes or host 
> config files.
> {quote}
> Ideally, the Namenode should be able to gracefully handle scenarios where the 
> datanodes in the "tracked.nodes" set are not making forward 

[jira] [Updated] (HDFS-16443) Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a DatanodeDescriptor on exception

2022-01-30 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16443:
-
Fix Version/s: 3.4.0

Merged the PR into trunk.

> Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a 
> DatanodeDescriptor on exception
> -
>
> Key: HDFS-16443
> URL: https://issues.apache.org/jira/browse/HDFS-16443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As part of the fix merged in: https://issues.apache.org/jira/browse/HDFS-16303
> There was a rare edge case noticed in DatanodeAdminDefaultMonitor which 
> causes a DatanodeDescriptor to be added twice to the pendingNodes queue. 
>  * a [datanode is unhealthy so it gets added to 
> "unhealthyDns"](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L227)
>  * an exception is thrown which causes [this catch 
> block](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L271)
>  to execute
>  * the [datanode is added to 
> "pendingNodes"](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L276)
>  * under certain conditions the [datanode can be added again from 
> "unhealthyDns" to "pendingNodes" 
> here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L296)
> This Jira is to track the 1-line fix for this bug.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16443) Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a DatanodeDescriptor on exception

2022-01-30 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16443:


Assignee: Kevin Wikant

> Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a 
> DatanodeDescriptor on exception
> -
>
> Key: HDFS-16443
> URL: https://issues.apache.org/jira/browse/HDFS-16443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As part of the fix merged in: https://issues.apache.org/jira/browse/HDFS-16303
> There was a rare edge case noticed in DatanodeAdminDefaultMonitor which 
> causes a DatanodeDescriptor to be added twice to the pendingNodes queue. 
>  * a [datanode is unhealthy so it gets added to 
> "unhealthyDns"](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L227)
>  * an exception is thrown which causes [this catch 
> block](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L271)
>  to execute
>  * the [datanode is added to 
> "pendingNodes"](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L276)
>  * under certain conditions the [datanode can be added again from 
> "unhealthyDns" to "pendingNodes" 
> here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java#L296)
> This Jira is to track the 1-line fix for this bug.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16441) The following error occurs when accessing webhdfs in Kerberos security mode:Failed to obtain user group information: java.io.IOException: Security enabled but user not a

2022-01-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16441.
--
Resolution: Invalid

Hi [~xiaoqiuqiu] - In the Apache Hadoop community, JIRA is used for 
development, not for end-user questions. Please use the 
[u...@hadoop.apache.org|mailto:u...@hadoop.apache.org] mailing list for 
end-user questions.

> The following error occurs when accessing webhdfs in Kerberos security 
> mode:Failed to obtain user group information: java.io.IOException: Security 
> enabled but user not authenticated by filter
> ---
>
> Key: HDFS-16441
> URL: https://issues.apache.org/jira/browse/HDFS-16441
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.3.1
>Reporter: xiaoqiuqiu
>Priority: Major
> Attachments: 1.png, 2.png, 3.png, 4.png
>
>
> The following error occurs when accessing webhdfs in Kerberos security 
> mode:Failed to obtain user group information: java.io.IOException: Security 
> enabled but user not authenticated by filter;
> When I access webhdfs from a browser, I still get the same error even though 
> the machine has Kerberos authentication;
>  
> The code is the first one in the comment area.
> How can this be solved?
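
For reference, this error usually means the client never performed SPNEGO 
negotiation. A typical working invocation on a Kerberized cluster looks like 
the following (host, port, and path are placeholders):

{noformat}
# after obtaining a ticket with: kinit <principal>
curl --negotiate -u : "http://<namenode-host>:9870/webhdfs/v1/tmp?op=LISTSTATUS"
{noformat}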



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16169) Fix TestBlockTokenWithDFSStriped#testEnd2End failure

2022-01-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16169:
-
Fix Version/s: 3.2.4

Backported to branch-3.2.

> Fix TestBlockTokenWithDFSStriped#testEnd2End failure
> 
>
> Key: HDFS-16169
> URL: https://issues.apache.org/jira/browse/HDFS-16169
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Hui Fei
>Assignee: secfree
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 141.936 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped 
> [ERROR] 
> testEnd2End(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped)
>  Time elapsed: 28.325 s <<< FAILURE! java.lang.AssertionError: expected:<9> 
> but was:<10> at org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:647) at 
> org.junit.Assert.assertEquals(Assert.java:633) at 
> org.apache.hadoop.hdfs.StripedFileTestUtil.verifyLocatedStripedBlocks(StripedFileTestUtil.java:344)
>  at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTestBalancerWithStripedFile(TestBalancer.java:1666)
>  at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.integrationTestWithStripedFile(TestBalancer.java:1601)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped.testEnd2End(TestBlockTokenWithDFSStriped.java:119)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
>  
> CI result is 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3296/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16169) Fix TestBlockTokenWithDFSStriped#testEnd2End failure

2022-01-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16169.
--
Fix Version/s: 3.4.0
   3.3.3
   Resolution: Fixed

Committed to trunk and branch-3.3. Thank you [~secfree.teng] for your 
contribution! Nice catch.

> Fix TestBlockTokenWithDFSStriped#testEnd2End failure
> 
>
> Key: HDFS-16169
> URL: https://issues.apache.org/jira/browse/HDFS-16169
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Hui Fei
>Assignee: secfree
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 141.936 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped 
> [ERROR] 
> testEnd2End(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped)
>  Time elapsed: 28.325 s <<< FAILURE! java.lang.AssertionError: expected:<9> 
> but was:<10> at org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:647) at 
> org.junit.Assert.assertEquals(Assert.java:633) at 
> org.apache.hadoop.hdfs.StripedFileTestUtil.verifyLocatedStripedBlocks(StripedFileTestUtil.java:344)
>  at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTestBalancerWithStripedFile(TestBalancer.java:1666)
>  at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.integrationTestWithStripedFile(TestBalancer.java:1601)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped.testEnd2End(TestBlockTokenWithDFSStriped.java:119)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
>  
> CI result is 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3296/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16169) Fix TestBlockTokenWithDFSStriped#testEnd2End failure

2022-01-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16169:


Assignee: secfree

> Fix TestBlockTokenWithDFSStriped#testEnd2End failure
> 
>
> Key: HDFS-16169
> URL: https://issues.apache.org/jira/browse/HDFS-16169
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Hui Fei
>Assignee: secfree
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 141.936 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped 
> [ERROR] 
> testEnd2End(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped)
>  Time elapsed: 28.325 s <<< FAILURE! java.lang.AssertionError: expected:<9> 
> but was:<10> at org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:647) at 
> org.junit.Assert.assertEquals(Assert.java:633) at 
> org.apache.hadoop.hdfs.StripedFileTestUtil.verifyLocatedStripedBlocks(StripedFileTestUtil.java:344)
>  at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTestBalancerWithStripedFile(TestBalancer.java:1666)
>  at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.integrationTestWithStripedFile(TestBalancer.java:1601)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped.testEnd2End(TestBlockTokenWithDFSStriped.java:119)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
>  
> CI result is 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3296/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16169) Fix TestBlockTokenWithDFSStriped#testEnd2End failure

2022-01-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16169:
-
Summary: Fix TestBlockTokenWithDFSStriped#testEnd2End failure  (was: 
TestBlockTokenWithDFSStriped#testEnd2End fails)

> Fix TestBlockTokenWithDFSStriped#testEnd2End failure
> 
>
> Key: HDFS-16169
> URL: https://issues.apache.org/jira/browse/HDFS-16169
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Hui Fei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 141.936 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped 
> [ERROR] 
> testEnd2End(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped)
>  Time elapsed: 28.325 s <<< FAILURE! java.lang.AssertionError: expected:<9> 
> but was:<10> at org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:647) at 
> org.junit.Assert.assertEquals(Assert.java:633) at 
> org.apache.hadoop.hdfs.StripedFileTestUtil.verifyLocatedStripedBlocks(StripedFileTestUtil.java:344)
>  at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTestBalancerWithStripedFile(TestBalancer.java:1666)
>  at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.integrationTestWithStripedFile(TestBalancer.java:1601)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped.testEnd2End(TestBlockTokenWithDFSStriped.java:119)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
>  
> CI result is 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3296/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning

2022-01-24 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480951#comment-17480951
 ] 

Akira Ajisaka commented on HDFS-16303:
--

Hi [~KevinWikant], how is this issue going?

> Losing over 100 datanodes in state decommissioning results in full blockage 
> of all datanode decommissioning
> ---
>
> Key: HDFS-16303
> URL: https://issues.apache.org/jira/browse/HDFS-16303
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1, 3.3.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> h2. Impact
> HDFS datanode decommissioning does not make any forward progress. For 
> example, the user adds X datanodes to the "dfs.hosts.exclude" file and all X 
> of those datanodes remain in state decommissioning forever without making any 
> forward progress towards being decommissioned.
> h2. Root Cause
> The HDFS Namenode class "DatanodeAdminManager" is responsible for 
> decommissioning datanodes.
> As per this "hdfs-site" configuration:
> {quote}Config = dfs.namenode.decommission.max.concurrent.tracked.nodes 
>  Default Value = 100
> The maximum number of decommission-in-progress datanodes that will be 
> tracked at one time by the namenode. Tracking a decommission-in-progress 
> datanode consumes additional NN memory proportional to the number of blocks 
> on the datanode. Having a conservative limit reduces the potential impact of 
> decommissioning a large number of nodes at once. A value of 0 means no limit 
> will be enforced.
> {quote}
> The Namenode will only actively track up to 100 datanodes for decommissioning 
> at any given time, so as to avoid Namenode memory pressure.
> Looking into the "DatanodeAdminManager" code:
>  * a datanode is only removed from the "tracked.nodes" set when it finishes 
> decommissioning
>  * a datanode is only added to the "tracked.nodes" set if there are fewer 
> than 100 datanodes being tracked
> So in the event that there are more than 100 datanodes being decommissioned 
> at a given time, some of those datanodes will not be in the "tracked.nodes" 
> set until 1 or more datanodes in the "tracked.nodes" finish 
> decommissioning. This is generally not a problem because the datanodes in 
> "tracked.nodes" will eventually finish decommissioning, but there is an edge 
> case where this logic prevents the namenode from making any forward progress 
> towards decommissioning.
> If all 100 datanodes in the "tracked.nodes" are unable to finish 
> decommissioning, then other datanodes (which may be able to be 
> decommissioned) will never get added to "tracked.nodes" and therefore will 
> never get the opportunity to be decommissioned.
> This can occur due to the following issue:
> {quote}2021-10-21 12:39:24,048 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager 
> (DatanodeAdminMonitor-0): Node W.X.Y.Z:50010 is dead while in Decommission In 
> Progress. Cannot be safely decommissioned or be in maintenance since there is 
> risk of reduced data durability or data loss. Either restart the failed node 
> or force decommissioning or maintenance by removing, calling refreshNodes, 
> then re-adding to the excludes or host config files.
> {quote}
> If a Datanode is lost while decommissioning (for example if the underlying 
> hardware fails or is lost), then it will remain in state decommissioning 
> forever.
> If 100 or more Datanodes are lost while decommissioning over the Hadoop 
> cluster lifetime, then this is enough to completely fill up the 
> "tracked.nodes" set. With the entire "tracked.nodes" set filled with 
> datanodes that can never finish decommissioning, any datanodes added after 
> this point will never be able to be decommissioned because they will never be 
> added to the "tracked.nodes" set.
> In this scenario:
>  * the "tracked.nodes" set is filled with datanodes which are lost & cannot 
> be recovered (and can never finish decommissioning so they will never be 
> removed from the set)
>  * the actual live datanodes being decommissioned are enqueued waiting to 
> enter the "tracked.nodes" set (and are stuck waiting indefinitely)
> This means that no progress towards decommissioning the live datanodes will 
> be made unless the user takes the following action:
> {quote}Either restart the failed node or force decommissioning or maintenance 
> by removing, calling refreshNodes, then re-adding to the excludes or host 
> config files.
> {quote}
> Ideally, the Namenode should be able to gracefully handle scenarios where the 
> datanodes in the "tracked.nodes" set are not making forward progress towards 
> decommissioning while the enqueued datanodes may be able to make forward 
> progress.
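For operators hitting this blockage before a fix is deployed, the tracking limit quoted above is configurable. A hedged hdfs-site.xml sketch (the property name and semantics come from the description above; the value 200 is purely illustrative, and 0 removes the limit at the cost of extra NN memory):

{code:xml}
<!-- Mitigation sketch only, not a fix for the dead-node edge case itself. -->
<property>
  <name>dfs.namenode.decommission.max.concurrent.tracked.nodes</name>
  <!-- Default is 100; 0 means no limit is enforced. Each tracked datanode
       consumes NN memory proportional to its block count. -->
  <value>200</value>
</property>
{code}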

[jira] [Commented] (HDFS-16316) Improve DirectoryScanner: add regular file check related block

2022-01-16 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17477006#comment-17477006
 ] 

Akira Ajisaka commented on HDFS-16316:
--

Do you know why the abnormal blocks appeared?

> Improve DirectoryScanner: add regular file check related block
> --
>
> Key: HDFS-16316
> URL: https://issues.apache.org/jira/browse/HDFS-16316
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Something unusual happened in our production environment.
> The DataNode is configured with 11 disks (${dfs.datanode.data.dir}). The used 
> capacity calculated for 10 of the disks is normal, but the value calculated 
> for the remaining disk is much larger, which is very strange.
> Here is the live view on the NameNode:
>  !screenshot-1.png! 
> Here is the live view on the DataNode:
>  !screenshot-2.png! 
> And here is the view on Linux:
>  !screenshot-3.png! 
> There is a big gap here regarding '/mnt/dfs/11/data'. This situation should 
> not be allowed to happen.
> I found some abnormal block files: invalid blk_.meta files in some subdir 
> directories, which corrupt the space calculation.
> Here are some of the abnormal block files:
>  !screenshot-4.png! 
> Such files should not be treated as normal blocks. They should be actively 
> identified and filtered out, which is good for cluster stability.
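As a rough illustration of the proposed check, here is a minimal sketch of filtering out non-regular files during a directory scan. The class and method names are hypothetical; this is not the actual DirectoryScanner change.

{code:java}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the real DirectoryScanner patch.
public class BlockFileFilter {
  /** Keeps only plain regular files; anything else (directories, devices,
   *  broken entries) is skipped and reported instead of being counted. */
  static List<File> filterRegularBlockFiles(File volumeDir) {
    List<File> accepted = new ArrayList<>();
    File[] entries = volumeDir.listFiles();
    if (entries == null) {
      return accepted;  // unreadable path or not a directory
    }
    for (File f : entries) {
      if (f.isFile()) {
        accepted.add(f);  // normal block or meta file
      } else {
        System.err.println("Skipping abnormal entry: " + f.getAbsolutePath());
      }
    }
    return accepted;
  }
}
{code}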



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16410) Insecure Xml parsing in OfflineEditsXmlLoader

2022-01-10 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16410:
-
Fix Version/s: 2.10.2
   3.2.3

Backported to branch-3.2, branch-3.2.3, and branch-2.10.

> Insecure Xml parsing in OfflineEditsXmlLoader 
> --
>
> Key: HDFS-16410
> URL: https://issues.apache.org/jira/browse/HDFS-16410
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: pull-request-available, security
> Fix For: 3.4.0, 2.10.2, 3.2.3, 3.3.2
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Insecure Xml parsing in OfflineEditsXmlLoader 
> [https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/OfflineEditsXmlLoader.java#L88]
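For context, the usual hardening for this class of issue is to disable DTDs and external entities on the parser factory before parsing untrusted XML. Below is a hedged sketch using standard JAXP feature flags; it is illustrative and not claimed to be the exact OfflineEditsXmlLoader diff.

{code:java}
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class SecureSaxFactory {
  static SAXParser newSecureParser() throws Exception {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    // Forbid DOCTYPE declarations entirely; this blocks most XXE vectors.
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // Also disable external entity resolution explicitly.
    factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    return factory.newSAXParser();
  }
}
{code}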



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16409) Fix typo: testHasExeceptionsReturnsCorrectValue -> testHasExceptionsReturnsCorrectValue

2022-01-03 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16409.
--
Fix Version/s: 3.4.0
   3.2.4
   3.3.3
   Resolution: Fixed

Committed to trunk, branch-3.3, and branch-3.2. Thanks [~groot].

> Fix typo: testHasExeceptionsReturnsCorrectValue -> 
> testHasExceptionsReturnsCorrectValue
> ---
>
> Key: HDFS-16409
> URL: https://issues.apache.org/jira/browse/HDFS-16409
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Fixing typo testHasExeceptionsReturnsCorrectValue to 
> testHasExceptionsReturnsCorrectValue in 
> {code:java}
> hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestAddBlockPoolException.java{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16409) Fix typo: testHasExeceptionsReturnsCorrectValue -> testHasExceptionsReturnsCorrectValue

2022-01-03 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16409:
-
Priority: Trivial  (was: Major)

> Fix typo: testHasExeceptionsReturnsCorrectValue -> 
> testHasExceptionsReturnsCorrectValue
> ---
>
> Key: HDFS-16409
> URL: https://issues.apache.org/jira/browse/HDFS-16409
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Fixing typo testHasExeceptionsReturnsCorrectValue to 
> testHasExceptionsReturnsCorrectValue in 
> {code:java}
> hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestAddBlockPoolException.java{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16409) Fix typo: testHasExeceptionsReturnsCorrectValue -> testHasExceptionsReturnsCorrectValue

2022-01-03 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16409:


Assignee: Ashutosh Gupta

> Fix typo: testHasExeceptionsReturnsCorrectValue -> 
> testHasExceptionsReturnsCorrectValue
> ---
>
> Key: HDFS-16409
> URL: https://issues.apache.org/jira/browse/HDFS-16409
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Fixing typo testHasExeceptionsReturnsCorrectValue to 
> testHasExceptionsReturnsCorrectValue in 
> {code:java}
> hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestAddBlockPoolException.java{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16409) Fix typo: testHasExeceptionsReturnsCorrectValue -> testHasExceptionsReturnsCorrectValue

2022-01-03 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16409:


Fix Version/s: (was: 3.4.0)
  Key: HDFS-16409  (was: HADOOP-18058)
 Assignee: (was: Ashutosh Gupta)
  Project: Hadoop HDFS  (was: Hadoop Common)

> Fix typo: testHasExeceptionsReturnsCorrectValue -> 
> testHasExceptionsReturnsCorrectValue
> ---
>
> Key: HDFS-16409
> URL: https://issues.apache.org/jira/browse/HDFS-16409
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Fixing typo testHasExeceptionsReturnsCorrectValue to 
> testHasExceptionsReturnsCorrectValue in 
> {code:java}
> hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestAddBlockPoolException.java{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16409) Fix typo: testHasExeceptionsReturnsCorrectValue -> testHasExceptionsReturnsCorrectValue

2022-01-03 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17468348#comment-17468348
 ] 

Akira Ajisaka commented on HDFS-16409:
--

Moved to HDFS project because the typo is in hadoop-hdfs module.

> Fix typo: testHasExeceptionsReturnsCorrectValue -> 
> testHasExceptionsReturnsCorrectValue
> ---
>
> Key: HDFS-16409
> URL: https://issues.apache.org/jira/browse/HDFS-16409
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Fixing typo testHasExeceptionsReturnsCorrectValue to 
> testHasExceptionsReturnsCorrectValue in 
> {code:java}
> hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestAddBlockPoolException.java{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2021-12-28 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466111#comment-17466111
 ] 

Akira Ajisaka commented on HDFS-14099:
--

Committed to trunk, branch-3.3, and branch-3.2.

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing:
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> Then I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> as below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I looked into the code, including the JNI layer, and found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice for the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call is correct, but *finished* is never set back to false, even if 
> there is more data (a new frame) in *CompressedBuffer* or *UserBuffer* that 
> needs to be decompressed.
> The second call is made in *org.apache.hadoop.io.compress.DecompressorStream* 
> via *decompressor.reset()*, because *finished* is always true after a *Frame* 
> has been decompressed.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover buffered input from previous
>   // stream; if not, attempt to refill buffer.  If refill -> EOF, we're
>   // all done; else reset, fix up input buffer, and get ready for next
>   // concatenated substream/"member".
>   int nRemaining = decompressor.getRemaining();
>   if (nRemaining == 0) {
> int m = getCompressedData();
> if (m == -1) {
>   // apparently the previous end-of-stream was also end-of-file:
>   // return success, as if we had never called getCompressedData()
>   eof = true;
>   return -1;
> }
> decompressor.reset();
> 
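For anyone reproducing this, here is a hedged reading-side sketch of the failing scenario (the file path and buffer size are illustrative assumptions). With the fix applied, DecompressorStream resets the decompressor between concatenated frames instead of throwing "Unknown frame descriptor":

{code:java}
import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class MultiFrameRead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path("/tmp/multiframe.zst");  // assumed multi-frame file
    // The codec is resolved from the file suffix, as "hadoop fs -text" does.
    CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(path);
    long total = 0;
    try (InputStream in = codec.createInputStream(fs.open(path))) {
      byte[] buf = new byte[64 * 1024];
      int n;
      while ((n = in.read(buf)) > 0) {
        total += n;  // frame boundaries are handled inside the stream
      }
    }
    System.out.println("decompressed bytes: " + total);
  }
}
{code}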

[jira] [Updated] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2021-12-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-14099:
-
Fix Version/s: 3.4.0
   3.2.4
   3.3.3
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing:
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> Then I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> as below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I looked into the code, including the JNI layer, and found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice for the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call is correct, but *finished* is never set back to false, even if 
> there is more data (a new frame) in *CompressedBuffer* or *UserBuffer* that 
> needs to be decompressed.
> The second call is made in *org.apache.hadoop.io.compress.DecompressorStream* 
> via *decompressor.reset()*, because *finished* is always true after a *Frame* 
> has been decompressed.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover buffered input from previous
>   // stream; if not, attempt to refill buffer.  If refill -> EOF, we're
>   // all done; else reset, fix up input buffer, and get ready for next
>   // concatenated substream/"member".
>   int nRemaining = decompressor.getRemaining();
>   if (nRemaining == 0) {
> int m = getCompressedData();
> if (m == -1) {
>   // apparently the previous end-of-stream was also end-of-file:
>   // return success, as if we had never called getCompressedData()
>   eof = true;
>   

[jira] [Commented] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2021-12-28 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466105#comment-17466105
 ] 

Akira Ajisaka commented on HDFS-14099:
--

[~groot] rebased the patch and I merged it. Thank you [~xuzq_zander] and 
[~groot] for your contributions!

Sorry, I should have moved this issue to the HADOOP project instead of HDFS.

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing:
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> Then I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> as below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I looked into the code, including the JNI layer, and found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice for the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call is correct, but *finished* is never set back to false, even if 
> there is more data (a new frame) in *CompressedBuffer* or *UserBuffer* that 
> needs to be decompressed.
> The second call is made in *org.apache.hadoop.io.compress.DecompressorStream* 
> via *decompressor.reset()*, because *finished* is always true after a *Frame* 
> has been decompressed.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover buffered input from previous
>   // stream; if not, attempt to refill buffer.  If refill -> EOF, we're
>   // all done; else reset, fix up input buffer, and get ready for next
>   // concatenated substream/"member".
>   int nRemaining = decompressor.getRemaining();
>   if (nRemaining == 0) {
> int m = getCompressedData();
> if (m == -1) {
>   // apparently the previous end-of-stream was also end-of-file:
>   // return success, as if we had never called getCompressedData()
>   eof = 

[jira] [Updated] (HDFS-16395) Remove useless NNThroughputBenchmark#dummyActionNoSynch()

2021-12-24 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16395:
-
Issue Type: Bug  (was: Improvement)

> Remove useless NNThroughputBenchmark#dummyActionNoSynch()
> -
>
> Key: HDFS-16395
> URL: https://issues.apache.org/jira/browse/HDFS-16395
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: benchmarks, namenode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> NNThroughputBenchmark#dummyActionNoSynch() doesn't seem to be used anywhere, 
> so it is recommended to delete it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16395) Remove useless NNThroughputBenchmark#dummyActionNoSynch()

2021-12-24 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16395.
--
Fix Version/s: 3.4.0
   3.2.4
   3.3.3
   Resolution: Fixed

Committed to trunk, branch-3.3, and branch-3.2. Thank you [~jianghuazhu] for 
your contribution!

> Remove useless NNThroughputBenchmark#dummyActionNoSynch()
> -
>
> Key: HDFS-16395
> URL: https://issues.apache.org/jira/browse/HDFS-16395
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: benchmarks, namenode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> NNThroughputBenchmark#dummyActionNoSynch() doesn't seem to be used anywhere, 
> so it is recommended to delete it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning

2021-12-23 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17464588#comment-17464588
 ] 

Akira Ajisaka commented on HDFS-16303:
--

Thank you for your contribution, [~KevinWikant]. Would you provide PRs for 
branch-3.3 and branch-3.2? The commit cannot be cherry-picked cleanly.

> Losing over 100 datanodes in state decommissioning results in full blockage 
> of all datanode decommissioning
> ---
>
> Key: HDFS-16303
> URL: https://issues.apache.org/jira/browse/HDFS-16303
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1, 3.3.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> h2. Impact
> HDFS datanode decommissioning does not make any forward progress. For 
> example, the user adds X datanodes to the "dfs.hosts.exclude" file and all X 
> of those datanodes remain in state decommissioning forever without making any 
> forward progress towards being decommissioned.
> h2. Root Cause
> The HDFS Namenode class "DatanodeAdminManager" is responsible for 
> decommissioning datanodes.
> As per this "hdfs-site" configuration:
> {quote}Config = dfs.namenode.decommission.max.concurrent.tracked.nodes 
>  Default Value = 100
> The maximum number of decommission-in-progress datanodes that will be 
> tracked at one time by the namenode. Tracking a decommission-in-progress 
> datanode consumes additional NN memory proportional to the number of blocks 
> on the datanode. Having a conservative limit reduces the potential impact of 
> decommissioning a large number of nodes at once. A value of 0 means no limit 
> will be enforced.
> {quote}
> The Namenode will only actively track up to 100 datanodes for decommissioning 
> at any given time, so as to avoid Namenode memory pressure.
> Looking into the "DatanodeAdminManager" code:
>  * a datanode is only removed from the "tracked.nodes" set when it 
> finishes decommissioning
>  * a datanode is only added to the "tracked.nodes" set if there are fewer 
> than 100 datanodes being tracked
> So in the event that there are more than 100 datanodes being decommissioned 
> at a given time, some of those datanodes will not be in the "tracked.nodes" 
> set until 1 or more datanodes in the "tracked.nodes" finishes 
> decommissioning. This is generally not a problem because the datanodes in 
> "tracked.nodes" will eventually finish decommissioning, but there is an edge 
> case where this logic prevents the namenode from making any forward progress 
> towards decommissioning.
> If all 100 datanodes in the "tracked.nodes" are unable to finish 
> decommissioning, then other datanodes (which may be able to be 
> decommissioned) will never get added to "tracked.nodes" and therefore will 
> never get the opportunity to be decommissioned.
> This can occur due to the following issue:
> {quote}2021-10-21 12:39:24,048 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager 
> (DatanodeAdminMonitor-0): Node W.X.Y.Z:50010 is dead while in Decommission In 
> Progress. Cannot be safely decommissioned or be in maintenance since there is 
> risk of reduced data durability or data loss. Either restart the failed node 
> or force decommissioning or maintenance by removing, calling refreshNodes, 
> then re-adding to the excludes or host config files.
> {quote}
> If a Datanode is lost while decommissioning (for example if the underlying 
> hardware fails or is lost), then it will remain in state decommissioning 
> forever.
> If 100 or more Datanodes are lost while decommissioning over the Hadoop 
> cluster lifetime, then this is enough to completely fill up the 
> "tracked.nodes" set. With the entire "tracked.nodes" set filled with 
> datanodes that can never finish decommissioning, any datanodes added after 
> this point will never be able to be decommissioned because they will never be 
> added to the "tracked.nodes" set.
> In this scenario:
>  * the "tracked.nodes" set is filled with datanodes which are lost & cannot 
> be recovered (and can never finish decommissioning so they will never be 
> removed from the set)
>  * the actual live datanodes being decommissioned are enqueued waiting to 
> enter the "tracked.nodes" set (and are stuck waiting indefinitely)
> This means that no progress towards decommissioning the live datanodes will 
> be made unless the user takes the following action:
> {quote}Either restart the failed node or force decommissioning or maintenance 
> by removing, calling refreshNodes, then re-adding to the excludes or host 
> config files.
> {quote}
> Ideally, the Namenode should be able to gracefully handle scenarios where the 
> datanodes in the "tracked.nodes" set are not making forward progress towards 
> decommissioning while the enqueued datanodes may be able to make forward 
> progress.
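To make the failure mode concrete, here is a minimal sketch of the admission logic described above. All names are hypothetical; this is not the real DatanodeAdminManager code.

{code:java}
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of the tracked-set behaviour, for illustration only.
class TrackedSetSketch {
  static final int MAX_TRACKED = 100;           // the configured limit
  final Set<String> tracked = new HashSet<>();  // actively monitored nodes
  final Queue<String> pending = new ArrayDeque<>();

  void tick() {
    // Nodes leave the set only when they finish decommissioning; a dead
    // node never finishes, so it occupies a slot forever.
    tracked.removeIf(TrackedSetSketch::finishedDecommissioning);
    // New nodes are admitted only while there is room. With 100 stuck
    // entries, this loop never runs and pending nodes wait indefinitely.
    while (tracked.size() < MAX_TRACKED && !pending.isEmpty()) {
      tracked.add(pending.poll());
    }
  }

  static boolean finishedDecommissioning(String node) {
    return false;  // stand-in for the real liveness/replication check
  }
}
{code}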

[jira] [Updated] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning

2021-12-23 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16303:
-
Fix Version/s: 3.4.0

> Losing over 100 datanodes in state decommissioning results in full blockage 
> of all datanode decommissioning
> ---
>
> Key: HDFS-16303
> URL: https://issues.apache.org/jira/browse/HDFS-16303
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1, 3.3.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> h2. Impact
> HDFS datanode decommissioning does not make any forward progress. For 
> example, the user adds X datanodes to the "dfs.hosts.exclude" file and all X 
> of those datanodes remain in state decommissioning forever without making any 
> forward progress towards being decommissioned.
> h2. Root Cause
> The HDFS Namenode class "DatanodeAdminManager" is responsible for 
> decommissioning datanodes.
> As per this "hdfs-site" configuration:
> {quote}Config = dfs.namenode.decommission.max.concurrent.tracked.nodes 
>  Default Value = 100
> The maximum number of decommission-in-progress datanodes that will be 
> tracked at one time by the namenode. Tracking a decommission-in-progress 
> datanode consumes additional NN memory proportional to the number of blocks 
> on the datanode. Having a conservative limit reduces the potential impact of 
> decommissioning a large number of nodes at once. A value of 0 means no limit 
> will be enforced.
> {quote}
> The Namenode will only actively track up to 100 datanodes for decommissioning 
> at any given time, so as to avoid Namenode memory pressure.
> Looking into the "DatanodeAdminManager" code:
>  * a datanode is only removed from the "tracked.nodes" set when it 
> finishes decommissioning
>  * a datanode is only added to the "tracked.nodes" set if there are fewer 
> than 100 datanodes being tracked
> So in the event that there are more than 100 datanodes being decommissioned 
> at a given time, some of those datanodes will not be in the "tracked.nodes" 
> set until 1 or more datanodes in the "tracked.nodes" finishes 
> decommissioning. This is generally not a problem because the datanodes in 
> "tracked.nodes" will eventually finish decommissioning, but there is an edge 
> case where this logic prevents the namenode from making any forward progress 
> towards decommissioning.
> If all 100 datanodes in the "tracked.nodes" are unable to finish 
> decommissioning, then other datanodes (which may be able to be 
> decommissioned) will never get added to "tracked.nodes" and therefore will 
> never get the opportunity to be decommissioned.
> This can occur due to the following issue:
> {quote}2021-10-21 12:39:24,048 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager 
> (DatanodeAdminMonitor-0): Node W.X.Y.Z:50010 is dead while in Decommission In 
> Progress. Cannot be safely decommissioned or be in maintenance since there is 
> risk of reduced data durability or data loss. Either restart the failed node 
> or force decommissioning or maintenance by removing, calling refreshNodes, 
> then re-adding to the excludes or host config files.
> {quote}
> If a Datanode is lost while decommissioning (for example if the underlying 
> hardware fails or is lost), then it will remain in state decommissioning 
> forever.
> If 100 or more Datanodes are lost while decommissioning over the Hadoop 
> cluster lifetime, then this is enough to completely fill up the 
> "tracked.nodes" set. With the entire "tracked.nodes" set filled with 
> datanodes that can never finish decommissioning, any datanodes added after 
> this point will never be able to be decommissioned because they will never be 
> added to the "tracked.nodes" set.
> In this scenario:
>  * the "tracked.nodes" set is filled with datanodes which are lost & cannot 
> be recovered (and can never finish decommissioning so they will never be 
> removed from the set)
>  * the actual live datanodes being decommissioned are enqueued waiting to 
> enter the "tracked.nodes" set (and are stuck waiting indefinitely)
> This means that no progress towards decommissioning the live datanodes will 
> be made unless the user takes the following action:
> {quote}Either restart the failed node or force decommissioning or maintenance 
> by removing, calling refreshNodes, then re-adding to the excludes or host 
> config files.
> {quote}
> Ideally, the Namenode should be able to gracefully handle scenarios where the 
> datanodes in the "tracked.nodes" set are not making forward progress towards 
> decommissioning while the enqueued datanodes may be able to make forward 
> progress.

[jira] [Commented] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2021-12-22 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17464291#comment-17464291
 ] 

Akira Ajisaka commented on HDFS-14099:
--

After our internal testing, we found that HDFS-14099 is required in addition to 
HADOOP-17096. The 003 patch looks good to me. However, after applying the patch 
to trunk, compilation fails in the test code due to the commons-io version 
upgrade.

Hi [~xuzq_zander], would you rebase onto the latest trunk? Note that it is now 
recommended to create a PR on GitHub rather than attaching a patch to JIRA.

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing:
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> Then I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> as below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I looked into the code, including the JNI layer, and found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice for the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call is correct, but *finished* is never set back to false, even if 
> there is more data (a new frame) in *CompressedBuffer* or *UserBuffer* that 
> needs to be decompressed.
> The second call is made in *org.apache.hadoop.io.compress.DecompressorStream* 
> via *decompressor.reset()*, because *finished* is always true after a *Frame* 
> has been decompressed.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover buffered input from previous
>   // stream; if not, attempt to refill buffer.  If refill -> EOF, we're
>   // all done; else reset, fix up input buffer, and get ready for next
>   // concatenated substream/"member".
>   int nRemaining = decompressor.getRemaining();
>   if (nRemaining == 0) {
> int m = getCompressedData();
> if (m == -1) {
>   // apparently the previous 
