[jira] [Assigned] (HDFS-15674) TestBPOfferService#testMissBlocksWhenReregister fails on trunk

2020-11-16 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki reassigned HDFS-15674:
---

Assignee: Masatake Iwasaki

> TestBPOfferService#testMissBlocksWhenReregister fails on trunk
> --
>
> Key: HDFS-15674
> URL: https://issues.apache.org/jira/browse/HDFS-15674
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Masatake Iwasaki
>Priority: Major
>
> qbt report (Nov 8, 2020, 11:28 AM) shows failures timing out in 
> testMissBlocksWhenReregister 






[jira] [Commented] (HDFS-13576) RBF: Add destination path length validation for add/update mount entry

2020-11-16 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17233325#comment-17233325
 ] 

JiangHua Zhu commented on HDFS-13576:
-

The following is another test of mine: I set 
dfs.namenode.fs-limits.max-component-length=2 and executed a mkdir operation, 
and the following exception was thrown:
org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException: 
The maximum path component name limit of 333 in directory / is exceeded: 
limit=2 length=3
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxComponentLength(FSDirectory.java:1267)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:1369)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.unprotectedMkdir(FSDirMkdirOp.java:225)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createSingleDirectory(FSDirMkdirOp.java:169)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:77)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3475)
at 
org.apache.hadoop.hdfs.server.namenode.TestFsLimits.mkdirs(TestFsLimits.java:263)
at 
org.apache.hadoop.hdfs.server.namenode.TestFsLimits.testMaxComponentLength(TestFsLimits.java:90)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
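For reference, a minimal standalone repro of the same behaviour is sketched below. It 
assumes the hadoop-hdfs test artifact (MiniDFSCluster) is on the classpath; the class 
name and the "/333" path are illustrative only, and the exception may surface wrapped 
in a RemoteException depending on how the call reaches the NameNode.

{code:java}
// Minimal sketch (not the TestFsLimits code): with the component-length limit set to 2,
// creating the three-character directory "333" makes the NameNode reject the path with
// FSLimitException$PathComponentTooLongException, matching the stack trace above.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class MaxComponentLengthRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // dfs.namenode.fs-limits.max-component-length=2, as in the test described above
    conf.setInt(DFSConfigKeys.DFS_NAMENODE_MAX_COMPONENT_LENGTH_KEY, 2);
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      FileSystem fs = cluster.getFileSystem();
      fs.mkdirs(new Path("/333")); // length 3 > limit 2 -> PathComponentTooLongException
    } finally {
      cluster.shutdown();
    }
  }
}
{code}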

> RBF: Add destination path length validation for add/update mount entry
> --
>
> Key: HDFS-13576
> URL: https://issues.apache.org/jira/browse/HDFS-13576
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Ayush Saxena
>Priority: Minor
>
> Currently there is no validation to check the destination path length while 
> adding or updating a mount entry. But when trying to create a directory using 
> this mount entry, 
> {noformat}
> RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException){noformat}
> is thrown with the exception message 
> {noformat}
> "maximum path component name limit of ... directory / is 
> exceeded: limit=255 length=1817"{noformat}
>  






[jira] [Commented] (HDFS-13576) RBF: Add destination path length validation for add/update mount entry

2020-11-16 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17233324#comment-17233324
 ] 

JiangHua Zhu commented on HDFS-13576:
-

The following is another test of mine: I set 
dfs.namenode.fs-limits.max-component-length=2 and executed a mkdir operation, 
and the following exception was thrown:
org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException: 
The maximum path component name limit of 333 in directory / is exceeded: 
limit=2 length=3
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxComponentLength(FSDirectory.java:1267)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:1369)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.unprotectedMkdir(FSDirMkdirOp.java:225)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createSingleDirectory(FSDirMkdirOp.java:169)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:77)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3475)
at 
org.apache.hadoop.hdfs.server.namenode.TestFsLimits.mkdirs(TestFsLimits.java:263)
at 
org.apache.hadoop.hdfs.server.namenode.TestFsLimits.testMaxComponentLength(TestFsLimits.java:90)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)

> RBF: Add destination path length validation for add/update mount entry
> --
>
> Key: HDFS-13576
> URL: https://issues.apache.org/jira/browse/HDFS-13576
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Ayush Saxena
>Priority: Minor
>
> Currently there is no validation to check the destination path length while 
> adding or updating a mount entry. But when trying to create a directory using 
> this mount entry, 
> {noformat}
> RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException){noformat}
> is thrown with the exception message 
> {noformat}
> "maximum path component name limit of ... directory / is 
> exceeded: limit=255 length=1817"{noformat}
>  






[jira] [Issue Comment Deleted] (HDFS-13576) RBF: Add destination path length validation for add/update mount entry

2020-11-16 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu updated HDFS-13576:

Comment: was deleted

(was: The following is another test of mine: I set 
dfs.namenode.fs-limits.max-component-length=2 and executed a mkdir operation, 
and the following exception was thrown:
org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException: 
The maximum path component name limit of 333 in directory / is exceeded: 
limit=2 length=3
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxComponentLength(FSDirectory.java:1267)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:1369)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.unprotectedMkdir(FSDirMkdirOp.java:225)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createSingleDirectory(FSDirMkdirOp.java:169)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:77)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3475)
at 
org.apache.hadoop.hdfs.server.namenode.TestFsLimits.mkdirs(TestFsLimits.java:263)
at 
org.apache.hadoop.hdfs.server.namenode.TestFsLimits.testMaxComponentLength(TestFsLimits.java:90)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
)

> RBF: Add destination path length validation for add/update mount entry
> --
>
> Key: HDFS-13576
> URL: https://issues.apache.org/jira/browse/HDFS-13576
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Ayush Saxena
>Priority: Minor
>
> Currently there is no validation to check the destination path length while 
> adding or updating a mount entry. But when trying to create a directory using 
> this mount entry, 
> {noformat}
> RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException){noformat}
> is thrown with the exception message 
> {noformat}
> "maximum path component name limit of ... directory / is 
> exceeded: limit=255 length=1817"{noformat}
>  






[jira] [Commented] (HDFS-13576) RBF: Add destination path length validation for add/update mount entry

2020-11-16 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17233310#comment-17233310
 ] 

JiangHua Zhu commented on HDFS-13576:
-

This exception is expected. It is controlled by the parameter 
dfs.namenode.fs-limits.max-component-length, whose default value is 255; the 
parameter limits the length of an INode name. Once a component name exceeds 
this value, the above exception is thrown.
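
A hedged sketch of the kind of up-front check this issue asks for is below; the class 
and method names, and the way it would be wired into the Router's add/update 
mount-entry path, are illustrative assumptions rather than actual RBF code.

{code:java}
// Illustrative only: reject a destination path up front if any component exceeds
// dfs.namenode.fs-limits.max-component-length (default 255), instead of letting the
// later mkdir fail with PathComponentTooLongException.
import java.io.IOException;

public class DestinationPathValidator {

  public static void checkComponentLengths(String destination, int maxComponentLength)
      throws IOException {
    for (String component : destination.split("/")) {
      if (component.length() > maxComponentLength) {
        throw new IOException("Path component '" + component + "' in " + destination
            + " exceeds the limit of " + maxComponentLength
            + " (length=" + component.length() + ")");
      }
    }
  }

  public static void main(String[] args) {
    try {
      // Mirrors the test above: "333" is longer than a limit of 2.
      checkComponentLengths("/333", 2);
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}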

> RBF: Add destination path length validation for add/update mount entry
> --
>
> Key: HDFS-13576
> URL: https://issues.apache.org/jira/browse/HDFS-13576
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Ayush Saxena
>Priority: Minor
>
> Currently there is no validation to check the destination path length while 
> adding or updating a mount entry. But when trying to create a directory using 
> this mount entry, 
> {noformat}
> RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException){noformat}
> is thrown with the exception message 
> {noformat}
> "maximum path component name limit of ... directory / is 
> exceeded: limit=255 length=1817"{noformat}
>  






[jira] [Updated] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15685:
-
Fix Version/s: 3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Merged the PR into trunk and branch-3.3.

> [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS 
> fails
> 
>
> Key: HDFS-15685
> URL: https://issues.apache.org/jira/browse/HDFS-15685
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails after 
> [JDK-8225499|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8225499].
>  
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.115 
> s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider
> [ERROR] 
> testResolveDomainNameUsingDNS(org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider)
>   Time elapsed: 0.964 s  <<< FAILURE!
> java.lang.AssertionError: nn1 wasn't returned: 
> {host02.test/:8020=25, host01.test/:8020=25}
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:295)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:320)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {noformat}






[jira] [Work logged] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15685?focusedWorklogId=512695=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-512695
 ]

ASF GitHub Bot logged work on HDFS-15685:
-

Author: ASF GitHub Bot
Created on: 17/Nov/20 01:57
Start Date: 17/Nov/20 01:57
Worklog Time Spent: 10m 
  Work Description: aajisaka merged pull request #2465:
URL: https://github.com/apache/hadoop/pull/2465


   





Issue Time Tracking
---

Worklog Id: (was: 512695)
Time Spent: 0.5h  (was: 20m)

> [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS 
> fails
> 
>
> Key: HDFS-15685
> URL: https://issues.apache.org/jira/browse/HDFS-15685
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails after 
> [JDK-8225499|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8225499].
>  
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.115 
> s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider
> [ERROR] 
> testResolveDomainNameUsingDNS(org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider)
>   Time elapsed: 0.964 s  <<< FAILURE!
> java.lang.AssertionError: nn1 wasn't returned: 
> {host02.test/:8020=25, host01.test/:8020=25}
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:295)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:320)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> 

[jira] [Work logged] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15685?focusedWorklogId=512696=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-512696
 ]

ASF GitHub Bot logged work on HDFS-15685:
-

Author: ASF GitHub Bot
Created on: 17/Nov/20 01:57
Start Date: 17/Nov/20 01:57
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #2465:
URL: https://github.com/apache/hadoop/pull/2465#issuecomment-728635417


   Thank you @sunchao for your review.





Issue Time Tracking
---

Worklog Id: (was: 512696)
Time Spent: 40m  (was: 0.5h)

> [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS 
> fails
> 
>
> Key: HDFS-15685
> URL: https://issues.apache.org/jira/browse/HDFS-15685
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails after 
> [JDK-8225499|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8225499].
>  
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.115 
> s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider
> [ERROR] 
> testResolveDomainNameUsingDNS(org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider)
>   Time elapsed: 0.964 s  <<< FAILURE!
> java.lang.AssertionError: nn1 wasn't returned: 
> {host02.test/:8020=25, host01.test/:8020=25}
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:295)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:320)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> 

[jira] [Commented] (HDFS-15562) StandbyCheckpointer will do checkpoint repeatedly while connecting observer/active namenode failed

2020-11-16 Thread Aihua Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17233178#comment-17233178
 ] 

Aihua Xu commented on HDFS-15562:
-

Thanks [~shv] for your comment. When I get time, I will focus on not recreating 
the image if there is a recent one. 

> StandbyCheckpointer will do checkpoint repeatedly while connecting 
> observer/active namenode failed
> --
>
> Key: HDFS-15562
> URL: https://issues.apache.org/jira/browse/HDFS-15562
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: SunHao
>Assignee: Aihua Xu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15562.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We found that the standby namenode does checkpoints over and over when 
> connecting to the observer/active namenode fails.
> StandbyCheckpointer won't update “lastCheckpointTime” when uploading the new 
> fsimage to the other namenode fails, so the standby namenode keeps doing 
> checkpoints repeatedly.
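
A simplified, paraphrased sketch of the trigger logic described above (not the actual 
StandbyCheckpointer code; field and method names are illustrative) shows why only 
advancing lastCheckpointTime on success leads to back-to-back checkpoints when the 
fsimage upload fails:

{code:java}
// Simplified, paraphrased sketch of the checkpoint trigger; names are illustrative,
// not the real StandbyCheckpointer members.
class CheckpointLoopSketch {
  private long lastCheckpointTimeSecs = 0;

  void checkAndTriggerOnce(long nowSecs, long uncheckpointedTxns,
                           long txnCountThreshold, long periodSecs) {
    long secsSinceLast = nowSecs - lastCheckpointTimeSecs;
    if (uncheckpointedTxns >= txnCountThreshold || secsSinceLast >= periodSecs) {
      boolean uploaded = doCheckpointAndUpload(); // hypothetical helper
      if (uploaded) {
        // Only advanced on success, which is the behaviour the report describes:
        lastCheckpointTimeSecs = nowSecs;
      }
      // On an upload failure lastCheckpointTimeSecs stays stale, so the next pass
      // through this check fires again immediately -- the repeated checkpointing
      // reported in this issue.
    }
  }

  private boolean doCheckpointAndUpload() {
    return false; // placeholder: save the fsimage locally, then upload it to the other NN
  }
}
{code}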






[jira] [Created] (HDFS-15686) Provide documentation for ViewHDFS

2020-11-16 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDFS-15686:
--

 Summary: Provide documentation for ViewHDFS
 Key: HDFS-15686
 URL: https://issues.apache.org/jira/browse/HDFS-15686
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: viewfs, viewfsOverloadScheme, ViewHDFS
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G









[jira] [Commented] (HDFS-15680) Disable Broken Azure Junits

2020-11-16 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232952#comment-17232952
 ] 

Steve Loughran commented on HDFS-15680:
---

With the code changes from Ayush's patch and some more detail in the reporting:
{code}
java.lang.AssertionError: Failed to encode key: 
http://mockAccount.blob.core.windows.net/mockContainer/user/stevel/*;=[]%!#$'()/!#$'()*;=[]%:
  java.net.URISyntaxException: Illegal character in path at index 70: 
http://mockAccount.blob.core.windows.net/mockContainer/user/stevel/*;=[]%!#$'()/!#$'()*;=[]%
 -- "["

at 
org.apache.hadoop.fs.azure.MockStorageInterface.convertKeyToEncodedUri(MockStorageInterface.java:154)
at 
org.apache.hadoop.fs.azure.MockStorageInterface.access$300(MockStorageInterface.java:70)
at 
org.apache.hadoop.fs.azure.MockStorageInterface$MockCloudBlobDirectoryWrapper.listBlobs(MockStorageInterface.java:337)
at 
org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.listRootBlobs(AzureNativeFileSystemStore.java:1921)
at 
org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.listInternal(AzureNativeFileSystemStore.java:2320)
at 
org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.list(AzureNativeFileSystemStore.java:2295)
at 
org.apache.hadoop.fs.azure.NativeAzureFileSystem.listWithErrorHandling(NativeAzureFileSystem.java:2876)
at 
org.apache.hadoop.fs.azure.NativeAzureFileSystem.listStatus(NativeAzureFileSystem.java:2822)
at 
org.apache.hadoop.fs.azure.NativeAzureFileSystemBaseTest.testUriEncodingMoreComplexCharacters(NativeAzureFileSystemBaseTest.java:443)
{code}

So: it's the [ symbol.

This is where I am now questioning the merits of this specific test. All it is 
doing is verifying that we can save and restore a value in a map, doing URI 
conversion in the process. But this is only for the mock. I care more that the 
production FS supports these characters than anything else.

Could we, should we, be ruthless here: subclass this specific test case and 
downgrade it to a skip?

Before doing that, of course, we will have to run the full production ITests; I 
will make sure I have the setup there and do that.
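
To make the '[' behaviour concrete, here is a tiny standalone check, independent of 
the Azure mock code (the class name is illustrative and the URL simply mirrors the one 
in the failure above):

{code:java}
// Standalone check (not the MockStorageInterface code): java.net.URI's single-string
// constructor rejects '[' in a path, while the percent-encoded form parses fine.
import java.net.URI;
import java.net.URISyntaxException;

public class BracketUriCheck {
  public static void main(String[] args) {
    try {
      new URI("http://mockAccount.blob.core.windows.net/mockContainer/a[b]");
    } catch (URISyntaxException e) {
      System.out.println("rejected: " + e.getMessage()); // "Illegal character in path ..."
    }
    // Percent-encoding the brackets ("%5B"/"%5D") gives a URI that parses cleanly;
    // getPath() returns the decoded path containing the literal brackets.
    URI ok = URI.create("http://mockAccount.blob.core.windows.net/mockContainer/a%5Bb%5D");
    System.out.println("parsed path: " + ok.getPath()); // "/mockContainer/a[b]"
  }
}
{code}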

> Disable Broken Azure Junits
> ---
>
> Key: HDFS-15680
> URL: https://issues.apache.org/jira/browse/HDFS-15680
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are 6 test classes that have been failing on Yetus for several months. 
> They contribute to more than 41 failing tests, which makes reviewing every 
> Yetus report a pain in the neck. Another point is to save resources and avoid 
> tying up ports, memory, and CPU.
> Over the last month, there was some effort to bring Yetus back to a 
> stable state. However, there has been no progress in addressing the Azure failures.
> Generally, I do not like to disable failing tests, but for this specific 
> case, I do not think it makes any sense to have 41 failing tests from 
> one module for several months. Whenever someone finds that those tests are 
> useful, they can re-enable the tests on Yetus *_after_* the test is 
> fixed.
> Following a PR, I have to review that my patch does not cause any failures 
> (including changed error messages in existing tests). A thorough review takes 
> a considerable amount of time browsing the nightly builds and GitHub reports.
> So, please consider how much time has been spent reviewing those stack traces 
> over the last few months.
> Finally, this is one of the reasons developers tend to ignore the reports: 
> it would take too much time to review them, and by default the errors are 
> considered irrelevant.
> CC: [~aajisaka], [~elgoiri], [~weichiu], [~ayushtkn]
> {code:bash}
>   hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked 
>hadoop.fs.azure.TestNativeAzureFileSystemMocked 
>hadoop.fs.azure.TestBlobMetadata 
>hadoop.fs.azure.TestNativeAzureFileSystemConcurrency 
>hadoop.fs.azure.TestNativeAzureFileSystemFileNameCheck 
>hadoop.fs.azure.TestNativeAzureFileSystemContractMocked 
>hadoop.fs.azure.TestWasbFsck 
>hadoop.fs.azure.TestOutOfBandAzureBlobOperations 
> {code}
> {code:bash}
> org.apache.hadoop.fs.azure.TestBlobMetadata.testFolderMetadata
> org.apache.hadoop.fs.azure.TestBlobMetadata.testFirstContainerVersionMetadata
> org.apache.hadoop.fs.azure.TestBlobMetadata.testPermissionMetadata
> org.apache.hadoop.fs.azure.TestBlobMetadata.testOldPermissionMetadata
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency.testNoTempBlobsVisible
> 

[jira] [Commented] (HDFS-15684) EC: Call recoverLease on DFSStripedOutputStream close exception

2020-11-16 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232912#comment-17232912
 ] 

Hadoop QA commented on HDFS-15684:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 37m  
6s{color} |  | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} |  | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch does not contain any @author tags. 
{color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} |  | {color:green} The patch appears to include 1 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
7s{color} |  | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
 3s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
20s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
2s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
35s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 33s{color} |  | {color:green} branch has no errors when building and 
testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
19s{color} |  | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
49s{color} |  | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} |  | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
15s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
3s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m  
3s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
4s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m  
4s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch has no blanks issues. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 52s{color} | 
[/results-checkstyle-hadoop-hdfs-project.txt|https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/304/artifact/out/results-checkstyle-hadoop-hdfs-project.txt]
 | {color:orange} hadoop-hdfs-project: The patch generated 4 new + 33 unchanged 
- 0 fixed = 37 total (was 33) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
0s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 56s{color} |  | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
25s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 

[jira] [Commented] (HDFS-15680) Disable Broken Azure Junits

2020-11-16 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232901#comment-17232901
 ] 

Steve Loughran commented on HDFS-15680:
---

If it wasn't for the fact that the httpcomponents upgrade involved CVEs, I'd 
just roll that back and say "broken". 

> Disable Broken Azure Junits
> ---
>
> Key: HDFS-15680
> URL: https://issues.apache.org/jira/browse/HDFS-15680
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are 6 test classes that have been failing on Yetus for several months. 
> They contribute to more than 41 failing tests, which makes reviewing every 
> Yetus report a pain in the neck. Another point is to save resources and avoid 
> tying up ports, memory, and CPU.
> Over the last month, there was some effort to bring Yetus back to a 
> stable state. However, there has been no progress in addressing the Azure failures.
> Generally, I do not like to disable failing tests, but for this specific 
> case, I do not think it makes any sense to have 41 failing tests from 
> one module for several months. Whenever someone finds that those tests are 
> useful, they can re-enable the tests on Yetus *_after_* the test is 
> fixed.
> Following a PR, I have to review that my patch does not cause any failures 
> (including changed error messages in existing tests). A thorough review takes 
> a considerable amount of time browsing the nightly builds and GitHub reports.
> So, please consider how much time has been spent reviewing those stack traces 
> over the last few months.
> Finally, this is one of the reasons developers tend to ignore the reports: 
> it would take too much time to review them, and by default the errors are 
> considered irrelevant.
> CC: [~aajisaka], [~elgoiri], [~weichiu], [~ayushtkn]
> {code:bash}
>   hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked 
>hadoop.fs.azure.TestNativeAzureFileSystemMocked 
>hadoop.fs.azure.TestBlobMetadata 
>hadoop.fs.azure.TestNativeAzureFileSystemConcurrency 
>hadoop.fs.azure.TestNativeAzureFileSystemFileNameCheck 
>hadoop.fs.azure.TestNativeAzureFileSystemContractMocked 
>hadoop.fs.azure.TestWasbFsck 
>hadoop.fs.azure.TestOutOfBandAzureBlobOperations 
> {code}
> {code:bash}
> org.apache.hadoop.fs.azure.TestBlobMetadata.testFolderMetadata
> org.apache.hadoop.fs.azure.TestBlobMetadata.testFirstContainerVersionMetadata
> org.apache.hadoop.fs.azure.TestBlobMetadata.testPermissionMetadata
> org.apache.hadoop.fs.azure.TestBlobMetadata.testOldPermissionMetadata
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency.testNoTempBlobsVisible
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency.testLinkBlobs
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testListStatusRootDir
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testRenameDirectoryMoveToExistingDirectory
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testListStatus
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testRenameDirectoryAsExistingDirectory
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testRenameToDirWithSamePrefixAllowed
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testLSRootDir
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testDeleteRecursively
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemFileNameCheck.testWasbFsck
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testChineseCharactersFolderRename
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRedoRenameFolderInFolderListingWithZeroByteRenameMetadata
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRedoRenameFolderInFolderListing
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testUriEncoding
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testDeepFileCreation
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testListDirectory
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRedoRenameFolderRenameInProgress
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRenameFolder
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRenameImplicitFolder
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRedoRenameFolder
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testStoreDeleteFolder
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRename
> 

[jira] [Work logged] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15685?focusedWorklogId=512385=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-512385
 ]

ASF GitHub Bot logged work on HDFS-15685:
-

Author: ASF GitHub Bot
Created on: 16/Nov/20 14:22
Start Date: 16/Nov/20 14:22
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2465:
URL: https://github.com/apache/hadoop/pull/2465#issuecomment-728093034


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  29m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |   0m 54s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 56s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 45s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 26s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   2m 24s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 52s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |   0m 52s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 46s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 49s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  15m 21s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   0m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   2m 32s |  |  the patch passed  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 19s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 116m 39s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2465/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2465 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 92be35c5ddaf 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / dd85a90da6f |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2465/1/testReport/ |
   | Max. process+thread count | 693 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: 
hadoop-hdfs-project/hadoop-hdfs-client |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2465/1/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT 

[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-11-16 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232775#comment-17232775
 ] 

Hadoop QA commented on HDFS-15240:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
47s{color} |  | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} |  | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch does not contain any @author tags. 
{color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} |  | {color:green} The patch appears to include 2 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} |  | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
28s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 
27s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
11s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
45s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
55s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
23m 55s{color} |  | {color:green} branch has no errors when building and 
testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
44s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
42s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
30s{color} |  | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m 
36s{color} |  | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} |  | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
58s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
53s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 
53s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m  
3s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m  
3s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch has no blanks issues. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 45s{color} | 
[/results-checkstyle-root.txt|https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/303/artifact/out/results-checkstyle-root.txt]
 | {color:orange} root: The patch generated 1 new + 27 unchanged - 0 fixed = 28 
total (was 27) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
57s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
26m 39s{color} |  | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
47s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
41s{color} |  | {color:green} the patch passed 

[jira] [Commented] (HDFS-15413) DFSStripedInputStream throws exception when datanodes close idle connections

2020-11-16 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232754#comment-17232754
 ] 

Hui Fei commented on HDFS-15413:


[~lalapala] Thanks for involving me. I will take a look later.

> DFSStripedInputStream throws exception when datanodes close idle connections
> 
>
> Key: HDFS-15413
> URL: https://issues.apache.org/jira/browse/HDFS-15413
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, erasure-coding, hdfs-client
>Affects Versions: 3.1.3
> Environment: - Hadoop 3.1.3
> - erasure coding with ISA-L and RS-3-2-1024k scheme
> - running in kubernetes
> - dfs.client.socket-timeout = 1
> - dfs.datanode.socket.write.timeout = 1
>Reporter: Andrey Elenskiy
>Priority: Critical
> Attachments: out.log
>
>
> We've run into an issue with compactions failing in HBase when erasure coding 
> is enabled on a table directory. After digging further I was able to narrow 
> it down to a seek + read logic and able to reproduce the issue with hdfs 
> client only:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.FSDataInputStream;
> public class ReaderRaw {
> public static void main(final String[] args) throws Exception {
> Path p = new Path(args[0]);
> int bufLen = Integer.parseInt(args[1]);
> int sleepDuration = Integer.parseInt(args[2]);
> int countBeforeSleep = Integer.parseInt(args[3]);
> int countAfterSleep = Integer.parseInt(args[4]);
> Configuration conf = new Configuration();
> FSDataInputStream istream = FileSystem.get(conf).open(p);
> byte[] buf = new byte[bufLen];
> int readTotal = 0;
> int count = 0;
> try {
>   while (true) {
> istream.seek(readTotal);
> int bytesRemaining = bufLen;
> int bufOffset = 0;
> while (bytesRemaining > 0) {
>   int nread = istream.read(buf, 0, bufLen);
>   if (nread < 0) {
>   throw new Exception("nread is less than zero");
>   }
>   readTotal += nread;
>   bufOffset += nread;
>   bytesRemaining -= nread;
> }
> count++;
> if (count == countBeforeSleep) {
> System.out.println("sleeping for " + sleepDuration + " 
> milliseconds");
> Thread.sleep(sleepDuration);
> System.out.println("resuming");
> }
> if (count == countBeforeSleep + countAfterSleep) {
> System.out.println("done");
> break;
> }
>   }
> } catch (Exception e) {
> System.out.println("exception on read " + count + " read total " 
> + readTotal);
> throw e;
> }
> }
> }
> {code}
> The issue appears to be due to the fact that datanodes close the connection 
> of the EC client if it doesn't fetch the next packet for longer than 
> dfs.client.socket-timeout. The EC client doesn't retry and instead assumes 
> that those datanodes went away, resulting in a "missing blocks" exception.
> I was able to consistently reproduce with the following arguments:
> {noformat}
> bufLen = 100 (just below 1MB which is the size of the stripe) 
> sleepDuration = (dfs.client.socket-timeout + 1) * 1000 (in our case 11000)
> countBeforeSleep = 1
> countAfterSleep = 7
> {noformat}
> I've attached the entire log output of running the snippet above against 
> erasure coded file with RS-3-2-1024k policy. And here are the logs from 
> datanodes of disconnecting the client:
> datanode 1:
> {noformat}
> 2020-06-15 19:06:20,697 INFO datanode.DataNode: Likely the client has stopped 
> reading, disconnecting it (datanode-v11-0-hadoop.hadoop:9866:DataXceiver 
> error processing READ_BLOCK operation  src: /10.128.23.40:53748 dst: 
> /10.128.14.46:9866); java.net.SocketTimeoutException: 1 millis timeout 
> while waiting for channel to be ready for write. ch : 
> java.nio.channels.SocketChannel[connected local=/10.128.14.46:9866 
> remote=/10.128.23.40:53748]
> {noformat}
> datanode 2:
> {noformat}
> 2020-06-15 19:06:20,341 INFO datanode.DataNode: Likely the client has stopped 
> reading, disconnecting it (datanode-v11-1-hadoop.hadoop:9866:DataXceiver 
> error processing READ_BLOCK operation  src: /10.128.23.40:48772 dst: 
> /10.128.9.42:9866); java.net.SocketTimeoutException: 1 millis timeout 
> while waiting for channel to be ready for write. ch : 
> java.nio.channels.SocketChannel[connected local=/10.128.9.42:9866 
> remote=/10.128.23.40:48772]
> {noformat}
> datanode 3:
> {noformat}
> 2020-06-15 19:06:20,467 INFO 
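
One possible client-side mitigation for the behaviour described above is sketched here, assuming the caller can afford to reopen the file after a long idle period; the class name RetryingReader and the single-retry policy are illustrative assumptions, not existing HDFS behaviour.

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RetryingReader {
    // Reads up to buf.length bytes starting at pos, reopening the stream once
    // if the first attempt fails (for example because idle datanode
    // connections were closed while the caller slept).
    public static int readWithReopen(FileSystem fs, Path p, long pos, byte[] buf)
            throws IOException {
        try (FSDataInputStream in = fs.open(p)) {
            in.seek(pos);
            return in.read(buf, 0, buf.length);
        } catch (IOException firstAttempt) {
            // Assumption: a freshly opened stream re-establishes the datanode
            // connections, so the same range can be read again.
            try (FSDataInputStream in = fs.open(p)) {
                in.seek(pos);
                return in.read(buf, 0, buf.length);
            }
        }
    }
}
{code}

This does not change the datanode-side timeout; it only works around the missing retry in the striped reader.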

[jira] [Commented] (HDFS-15413) DFSStripedInputStream throws exception when datanodes close idle connections

2020-11-16 Thread gaozhan ding (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232751#comment-17232751
 ] 

gaozhan ding commented on HDFS-15413:
-

[~ferhui] Do you have any suggestions on this issue? We are facing the same 
problem.

> DFSStripedInputStream throws exception when datanodes close idle connections
> 
>
> Key: HDFS-15413
> URL: https://issues.apache.org/jira/browse/HDFS-15413
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, erasure-coding, hdfs-client
>Affects Versions: 3.1.3
> Environment: - Hadoop 3.1.3
> - erasure coding with ISA-L and RS-3-2-1024k scheme
> - running in kubernetes
> - dfs.client.socket-timeout = 1
> - dfs.datanode.socket.write.timeout = 1
>Reporter: Andrey Elenskiy
>Priority: Critical
> Attachments: out.log
>
>
> We've run into an issue with compactions failing in HBase when erasure coding 
> is enabled on a table directory. After digging further I was able to narrow 
> it down to the seek + read logic and was able to reproduce the issue with the 
> hdfs client only:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.FSDataInputStream;
> public class ReaderRaw {
> public static void main(final String[] args) throws Exception {
> Path p = new Path(args[0]);
> int bufLen = Integer.parseInt(args[1]);
> int sleepDuration = Integer.parseInt(args[2]);
> int countBeforeSleep = Integer.parseInt(args[3]);
> int countAfterSleep = Integer.parseInt(args[4]);
> Configuration conf = new Configuration();
> FSDataInputStream istream = FileSystem.get(conf).open(p);
> byte[] buf = new byte[bufLen];
> int readTotal = 0;
> int count = 0;
> try {
>   while (true) {
> istream.seek(readTotal);
> int bytesRemaining = bufLen;
> int bufOffset = 0;
> while (bytesRemaining > 0) {
>   int nread = istream.read(buf, 0, bufLen);
>   if (nread < 0) {
>   throw new Exception("nread is less than zero");
>   }
>   readTotal += nread;
>   bufOffset += nread;
>   bytesRemaining -= nread;
> }
> count++;
> if (count == countBeforeSleep) {
> System.out.println("sleeping for " + sleepDuration + " 
> milliseconds");
> Thread.sleep(sleepDuration);
> System.out.println("resuming");
> }
> if (count == countBeforeSleep + countAfterSleep) {
> System.out.println("done");
> break;
> }
>   }
> } catch (Exception e) {
> System.out.println("exception on read " + count + " read total " 
> + readTotal);
> throw e;
> }
> }
> }
> {code}
> The issue appears to be due to the fact that datanodes close the connection 
> of the EC client if it doesn't fetch the next packet for longer than 
> dfs.client.socket-timeout. The EC client doesn't retry and instead assumes 
> that those datanodes went away, resulting in a "missing blocks" exception.
> I was able to consistently reproduce with the following arguments:
> {noformat}
> bufLen = 100 (just below 1MB which is the size of the stripe) 
> sleepDuration = (dfs.client.socket-timeout + 1) * 1000 (in our case 11000)
> countBeforeSleep = 1
> countAfterSleep = 7
> {noformat}
> I've attached the entire log output of running the snippet above against 
> erasure coded file with RS-3-2-1024k policy. And here are the logs from 
> datanodes of disconnecting the client:
> datanode 1:
> {noformat}
> 2020-06-15 19:06:20,697 INFO datanode.DataNode: Likely the client has stopped 
> reading, disconnecting it (datanode-v11-0-hadoop.hadoop:9866:DataXceiver 
> error processing READ_BLOCK operation  src: /10.128.23.40:53748 dst: 
> /10.128.14.46:9866); java.net.SocketTimeoutException: 1 millis timeout 
> while waiting for channel to be ready for write. ch : 
> java.nio.channels.SocketChannel[connected local=/10.128.14.46:9866 
> remote=/10.128.23.40:53748]
> {noformat}
> datanode 2:
> {noformat}
> 2020-06-15 19:06:20,341 INFO datanode.DataNode: Likely the client has stopped 
> reading, disconnecting it (datanode-v11-1-hadoop.hadoop:9866:DataXceiver 
> error processing READ_BLOCK operation  src: /10.128.23.40:48772 dst: 
> /10.128.9.42:9866); java.net.SocketTimeoutException: 1 millis timeout 
> while waiting for channel to be ready for write. ch : 
> java.nio.channels.SocketChannel[connected local=/10.128.9.42:9866 
> remote=/10.128.23.40:48772]
> {noformat}
> datanode 3:
> {noformat}

[jira] [Updated] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15685:
-
Target Version/s: 3.4.0
  Status: Patch Available  (was: Open)

> [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS 
> fails
> 
>
> Key: HDFS-15685
> URL: https://issues.apache.org/jira/browse/HDFS-15685
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails after 
> [JDK-8225499|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8225499].
>  
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.115 
> s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider
> [ERROR] 
> testResolveDomainNameUsingDNS(org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider)
>   Time elapsed: 0.964 s  <<< FAILURE!
> java.lang.AssertionError: nn1 wasn't returned: 
> {host02.test/:8020=25, host01.test/:8020=25}
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:295)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:320)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15685?focusedWorklogId=512314=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-512314
 ]

ASF GitHub Bot logged work on HDFS-15685:
-

Author: ASF GitHub Bot
Created on: 16/Nov/20 12:24
Start Date: 16/Nov/20 12:24
Worklog Time Spent: 10m 
  Work Description: aajisaka opened a new pull request #2465:
URL: https://github.com/apache/hadoop/pull/2465


   JIRA: https://issues.apache.org/jira/browse/HDFS-15685
   
   Manually tested in JDK 13, 14, and 16-ea.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 512314)
Remaining Estimate: 0h
Time Spent: 10m

> [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS 
> fails
> 
>
> Key: HDFS-15685
> URL: https://issues.apache.org/jira/browse/HDFS-15685
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails after 
> [JDK-8225499|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8225499].
>  
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.115 
> s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider
> [ERROR] 
> testResolveDomainNameUsingDNS(org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider)
>   Time elapsed: 0.964 s  <<< FAILURE!
> java.lang.AssertionError: nn1 wasn't returned: 
> {host02.test/:8020=25, host01.test/:8020=25}
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:295)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:320)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> 

[jira] [Updated] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-15685:
--
Labels: pull-request-available  (was: )

> [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS 
> fails
> 
>
> Key: HDFS-15685
> URL: https://issues.apache.org/jira/browse/HDFS-15685
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails after 
> [JDK-8225499|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8225499].
>  
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.115 
> s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider
> [ERROR] 
> testResolveDomainNameUsingDNS(org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider)
>   Time elapsed: 0.964 s  <<< FAILURE!
> java.lang.AssertionError: nn1 wasn't returned: 
> {host02.test/:8020=25, host01.test/:8020=25}
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:295)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:320)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15684) EC: Call recoverLease on DFSStripedOutputStream close exception

2020-11-16 Thread Hongbing Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232712#comment-17232712
 ] 

Hongbing Wang commented on HDFS-15684:
--

Added tests in the v2 patch.

> EC: Call recoverLease on DFSStripedOutputStream close exception
> ---
>
> Key: HDFS-15684
> URL: https://issues.apache.org/jira/browse/HDFS-15684
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, ec
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Major
> Attachments: HDFS-15684.001.patch, HDFS-15684.002.patch
>
>
> -HDFS-14694- added a feature that calls the recoverLease operation 
> automatically when a DFSOutputStream close encounters an exception. When we 
> wanted to apply this feature to our cluster, we found that it does not 
> support EC files. I think this feature should take effect for both replicated 
> files and EC files. This Jira proposes to make it effective for EC files as well.
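
For reference, a minimal sketch of what the requested behaviour looks like from a caller's point of view, assuming a DistributedFileSystem handle is available; the helper below is illustrative and is not the HDFS-14694 implementation.

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class CloseWithLeaseRecovery {
    // Closes the output stream; if close() fails, asks the NameNode to recover
    // the writer's lease so the file is not left open indefinitely.
    public static void closeSafely(DistributedFileSystem dfs, Path path,
            FSDataOutputStream out) throws IOException {
        try {
            out.close();
        } catch (IOException e) {
            boolean closed = dfs.recoverLease(path);
            if (!closed) {
                // Lease recovery has started but the file is not closed yet;
                // callers may want to poll recoverLease before reopening it.
                System.err.println("Lease recovery pending for " + path);
            }
            throw e;
        }
    }
}
{code}

The Jira asks for the same behaviour to be triggered automatically when DFSStripedOutputStream#close fails, rather than leaving it to the caller.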



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15684) EC: Call recoverLease on DFSStripedOutputStream close exception

2020-11-16 Thread Hongbing Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongbing Wang updated HDFS-15684:
-
Attachment: HDFS-15684.002.patch

> EC: Call recoverLease on DFSStripedOutputStream close exception
> ---
>
> Key: HDFS-15684
> URL: https://issues.apache.org/jira/browse/HDFS-15684
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, ec
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Major
> Attachments: HDFS-15684.001.patch, HDFS-15684.002.patch
>
>
> -HDFS-14694- added a feature that calls the recoverLease operation 
> automatically when a DFSOutputStream close encounters an exception. When we 
> wanted to apply this feature to our cluster, we found that it does not 
> support EC files. I think this feature should take effect for both replicated 
> files and EC files. This Jira proposes to make it effective for EC files as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-15685:


Assignee: Akira Ajisaka

> [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS 
> fails
> 
>
> Key: HDFS-15685
> URL: https://issues.apache.org/jira/browse/HDFS-15685
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>
> TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails after 
> [JDK-8225499|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8225499].
>  
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.115 
> s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider
> [ERROR] 
> testResolveDomainNameUsingDNS(org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider)
>   Time elapsed: 0.964 s  <<< FAILURE!
> java.lang.AssertionError: nn1 wasn't returned: 
> {host02.test/:8020=25, host01.test/:8020=25}
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:295)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:320)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15685) [JDK 14] TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails

2020-11-16 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-15685:


 Summary: [JDK 14] 
TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails
 Key: HDFS-15685
 URL: https://issues.apache.org/jira/browse/HDFS-15685
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Akira Ajisaka


TestConfiguredFailoverProxyProvider#testResolveDomainNameUsingDNS fails after 
[JDK-8225499|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8225499]. 
{noformat}
[ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.115 s 
<<< FAILURE! - in 
org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider
[ERROR] 
testResolveDomainNameUsingDNS(org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider)
  Time elapsed: 0.964 s  <<< FAILURE!
java.lang.AssertionError: nn1 wasn't returned: 
{host02.test/:8020=25, host01.test/:8020=25}
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:295)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestConfiguredFailoverProxyProvider.testResolveDomainNameUsingDNS(TestConfiguredFailoverProxyProvider.java:320)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
{noformat}
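
A minimal sketch of the toString difference suspected here, assuming JDK-8225499 changed how unresolved InetSocketAddress instances are rendered; the class name ToStringProbe is made up for illustration.

{code:java}
import java.net.InetSocketAddress;

public class ToStringProbe {
    public static void main(String[] args) {
        InetSocketAddress addr =
            InetSocketAddress.createUnresolved("host01.test", 8020);
        // Earlier JDKs typically printed "host01.test:8020"; newer JDKs render
        // unresolved addresses differently, so tests that build lookup keys
        // from toString() output can stop matching.
        System.out.println(addr);
    }
}
{code}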



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15645) DatanodeManager.getNumOfDataNodes should consider the include set of host config.

2020-11-16 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232686#comment-17232686
 ] 

Xiaoqiao He commented on HDFS-15645:


[~LiJinglun], thanks for involving me here. Sorry, I don't get why we should change to 
#getDatanodeListForReport here; is there any case you have met? Thanks.

> DatanodeManager.getNumOfDataNodes should consider the include set of host 
> config.
> -
>
> Key: HDFS-15645
> URL: https://issues.apache.org/jira/browse/HDFS-15645
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15645.001.patch
>
>
> Currently DatanodeManager.getNumOfDataNodes() only counts the size of 
> datanodeMap. The nodes from the host file's include set should also be counted. 
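
A rough sketch of the proposed counting, assuming an include list is available alongside the registered datanodes; the names below are illustrative and are not the actual DatanodeManager code.

{code:java}
import java.util.HashSet;
import java.util.Set;

public class NumDataNodesSketch {
    // Counts registered datanodes plus hosts from the include file that have
    // not registered yet, so the total reflects the configured cluster size.
    public static int getNumOfDataNodes(Set<String> registeredHosts,
            Set<String> includedHosts) {
        Set<String> all = new HashSet<>(registeredHosts);
        all.addAll(includedHosts);
        return all.size();
    }
}
{code}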



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15645) DatanodeManager.getNumOfDataNodes should consider the include set of host config.

2020-11-16 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232659#comment-17232659
 ] 

Jinglun commented on HDFS-15645:


Hi [~jianghuazhu] [~hexiaoqiao] [~ayushsaxena], could you help review this? 
Thanks!

> DatanodeManager.getNumOfDataNodes should consider the include set of host 
> config.
> -
>
> Key: HDFS-15645
> URL: https://issues.apache.org/jira/browse/HDFS-15645
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15645.001.patch
>
>
> Currently DatanodeManager.getNumOfDataNodes() only counts the size of 
> datanodeMap. The nodes from the host file's include set should also be counted. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-11-16 Thread HuangTao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232608#comment-17232608
 ] 

HuangTao commented on HDFS-15240:
-

Uploaded [^HDFS-15240.008.patch] to fix checkstyle.

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, 
> HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, 
> HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, 
> image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 blocks (b6', b7', 
> b8'), and found the longest common sequence (LCS) between b6' (decoded) and 
> b6 (read from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group in each combination and iterating 
> through all cases, I found one case where the length of the LCS is the block 
> length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was made by a dirty 
> buffer.
> The following log snippet (showing only 2 of the 28 cases) is my check 
> program's output. In my case, I knew block[3] was corrupt, so 5 other blocks 
> were needed to decode another 3 blocks, and I then found that block[1]'s LCS 
> length is the block length - 64KB.
> It means blocks [0, 1, 2, 4, 5, 6] were used to reconstruct block[3], and the 
> dirty buffer was used before reading block[1].
> It must be noted that StripedBlockReader reads from offset 0 of block[1] 
> after the dirty buffer was used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN log, I found that the following DN log 
> shows the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834) {code}
> Reading from a DN may time out (held by a future (F)) and output the INFO log, 
> but the futures that contain the 
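
A minimal sketch of the buffer hygiene the analysis points to, assuming a reused read buffer must be reset (or dropped) before the next read so that a timed-out read cannot leave stale bytes behind; the helper is illustrative and is not the actual StripedBlockReader code.

{code:java}
import java.nio.ByteBuffer;

public class BufferHygieneSketch {
    // Resets a reused buffer and zeroes its contents so that data left over
    // from an earlier, timed-out read cannot be mistaken for freshly read bytes.
    public static ByteBuffer prepareForReuse(ByteBuffer buf) {
        buf.clear();
        while (buf.hasRemaining()) {
            buf.put((byte) 0);
        }
        buf.clear();
        return buf;
    }
}
{code}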

[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-11-16 Thread HuangTao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HuangTao updated HDFS-15240:

Attachment: HDFS-15240.008.patch

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, 
> HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, 
> HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, 
> image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 blocks (b6', b7', 
> b8'), and found the longest common sequence (LCS) between b6' (decoded) and 
> b6 (read from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group in each combination and iterating 
> through all cases, I found one case where the length of the LCS is the block 
> length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was made by a dirty 
> buffer.
> The following log snippet (showing only 2 of the 28 cases) is my check 
> program's output. In my case, I knew block[3] was corrupt, so 5 other blocks 
> were needed to decode another 3 blocks, and I then found that block[1]'s LCS 
> length is the block length - 64KB.
> It means blocks [0, 1, 2, 4, 5, 6] were used to reconstruct block[3], and the 
> dirty buffer was used before reading block[1].
> It must be noted that StripedBlockReader reads from offset 0 of block[1] 
> after the dirty buffer was used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN log, I found that the following DN log 
> shows the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834) {code}
> Reading from a DN may time out (held by a future (F)) and output the INFO log, 
> but the futures collection that contains the future (F) is cleared, 
> {code:java}
> return new