[jira] [Commented] (HDFS-15159) Prevent adding same DN multiple times in PendingReconstructionBlocks

2020-03-11 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057604#comment-17057604
 ] 

hemanthboyina commented on HDFS-15159:
--

Thanks for the comments [~surendrasingh] [~elgoiri].

I have updated the patch with a test case.
There was some problem with the build; could you please trigger the build again?
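For context, a minimal self-contained sketch of the idea in the issue title 
(hypothetical names, not the actual patch): record each datanode at most once 
per pending block, so the same DN cannot be added twice.
{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the idea in the issue title, not the actual patch:
// a pending-reconstruction entry that refuses duplicate datanode targets.
class PendingBlockSketch {
  private final List<String> targets = new ArrayList<>();

  // Record each datanode at most once; report a duplicate to the caller.
  synchronized boolean addTarget(String datanodeUuid) {
    if (targets.contains(datanodeUuid)) {
      return false; // this DN is already pending for the block
    }
    return targets.add(datanodeUuid);
  }

  synchronized int getNumTargets() {
    return targets.size();
  }
}
{code}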

> Prevent adding same DN multiple times in PendingReconstructionBlocks
> 
>
> Key: HDFS-15159
> URL: https://issues.apache.org/jira/browse/HDFS-15159
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15159.001.patch, HDFS-15159.002.patch
>
>







[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-03-11 Thread Siddharth Wagle (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057597#comment-17057597
 ] 

Siddharth Wagle commented on HDFS-15154:


10 => Updated the patch with the changes suggested by [~hanishakoneru]: changed 
the exception message to be simpler, since we already print a deprecation 
warning, and updated hdfs-default.xml.

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch, 
> HDFS-15154.09.patch, HDFS-15154.10.patch
>
>
> Please provide a way to limit the ability to assign HDFS Storage Policies to 
> HDFS directories to HDFS superusers only.
> Currently, based on HDFS-7093, all storage policies can be disabled 
> cluster-wide by setting dfs.storage.policy.enabled to false.
> But we need a way to allow only HDFS superusers to assign an HDFS Storage 
> Policy to an HDFS directory.
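As a rough illustration of the requested behavior (the config key and class 
below are assumptions for this sketch, not the actual HDFS-15154 patch), the 
check could be a superuser gate in front of the set-policy path:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.AccessControlException;
import org.apache.hadoop.security.UserGroupInformation;

// Rough illustration only; the key name and class are assumptions,
// not the actual HDFS-15154 patch.
class StoragePolicyGuard {
  static void checkCanSetStoragePolicy(Configuration conf, String superGroup)
      throws Exception {
    // Hypothetical switch: when true, only superusers may set policies.
    boolean superuserOnly =
        conf.getBoolean("dfs.storage.policy.superuser-only", false);
    if (!superuserOnly) {
      return; // anyone with ordinary permissions may proceed
    }
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    boolean isSuper = false;
    for (String group : ugi.getGroupNames()) {
      if (group.equals(superGroup)) {
        isSuper = true;
        break;
      }
    }
    if (!isSuper) {
      throw new AccessControlException(
          "Only HDFS superusers may assign storage policies");
    }
  }
}
{code}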






[jira] [Updated] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-03-11 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-15154:
---
Attachment: HDFS-15154.10.patch

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch, 
> HDFS-15154.09.patch, HDFS-15154.10.patch
>
>
> Please provide a way to limit the ability to assign HDFS Storage Policies to 
> HDFS directories to HDFS superusers only.
> Currently, based on HDFS-7093, all storage policies can be disabled 
> cluster-wide by setting dfs.storage.policy.enabled to false.
> But we need a way to allow only HDFS superusers to assign an HDFS Storage 
> Policy to an HDFS directory.






[jira] [Updated] (HDFS-15077) Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout

2020-03-11 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-15077:

Fix Version/s: 2.10.1

> Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout
> 
>
> Key: HDFS-15077
> URL: https://issues.apache.org/jira/browse/HDFS-15077
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15077-branch-2.10.patch
>
>
> {{TestDFSClientRetries#testLeaseRenewSocketTimeout}} intermittently fails due 
> to a race between the test thread and the LeaseRenewer thread.






[jira] [Commented] (HDFS-15077) Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout

2020-03-11 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057588#comment-17057588
 ] 

Masatake Iwasaki commented on HDFS-15077:
-

Pushed the backported patch to branch-2.10.

> Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout
> 
>
> Key: HDFS-15077
> URL: https://issues.apache.org/jira/browse/HDFS-15077
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15077-branch-2.10.patch
>
>
> {{TestDFSClientRetries#testLeaseRenewSocketTimeout}} intermittently fails due 
> to a race between the test thread and the LeaseRenewer thread.






[jira] [Updated] (HDFS-15077) Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout

2020-03-11 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-15077:

Attachment: HDFS-15077-branch-2.10.patch

> Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout
> 
>
> Key: HDFS-15077
> URL: https://issues.apache.org/jira/browse/HDFS-15077
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15077-branch-2.10.patch
>
>
> {{TestDFSClientRetries#testLeaseRenewSocketTimeout}} intermittently fails due 
> to a race between the test thread and the LeaseRenewer thread.






[jira] [Updated] (HDFS-15219) DFS Client will get stuck when ResponseProcessor.run throws an Error

2020-03-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated HDFS-15219:
---
Description: 
In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:
{code:java}
2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
 at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 4 more
Caused by: java.util.zip.ZipException: error reading zip file
 at java.util.zip.ZipFile.read(Native Method)
 at java.util.zip.ZipFile.access$1400(ZipFile.java:56)
 at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:679)
 at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:415)
 at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
 at sun.misc.Resource.getBytes(Resource.java:124)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:444)
 at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
 ... 10 more
2020-03-11 01:29:02,970 [INFO] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|util.ExitUtil|: Exiting with status -1
2020-03-11 03:27:26,833 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: 
Received should die response from AM
2020-03-11 03:27:26,834 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Asked 
to die via task heartbeat
2020-03-11 03:27:26,839 [INFO] [TaskHeartbeatThread] |task.TezTaskRunner2|: 
Attempting to abort attempt_1583335296048_917815_3_01_000704_0 due to an 
invocation of shutdownRequested

{code}
The reason is an uncaught exception. At 01:29 a disk error occurred, so a 
NoClassDefFoundError was thrown. ResponseProcessor.run only catches Exception 
and cannot catch NoClassDefFoundError, so the ResponseProcessor did not set 
errorState. The DataStreamer therefore did not know the ResponseProcessor was 
dead and could not trigger closeResponder, so it got stuck in DataStreamer.run.

I tested this in the unit test TestDataStream.testDfsClient. When I throw 
NoClassDefFoundError in ResponseProcessor.run, TestDataStream.testDfsClient 
fails because of a timeout.

I think we should catch Throwable instead of Exception in ResponseProcessor.run.
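A minimal, self-contained sketch of that proposal (illustrative names, not the 
actual patch):
{code:java}
// Illustrative sketch, not the actual patch: catch Throwable so that
// Errors (e.g. NoClassDefFoundError) also set the error state instead of
// silently killing the responder thread.
class ResponderSketch implements Runnable {
  private volatile boolean responderClosed = false;
  private volatile boolean errorState = false;

  private void processAck() throws Exception {
    // placeholder for reading and processing one pipeline ack
  }

  @Override
  public void run() {
    while (!responderClosed && !errorState) {
      try {
        processAck();
      } catch (Throwable t) { // was: catch (Exception e)
        errorState = true;    // now the streamer side can observe the failure
      }
    }
  }
}
{code}
With the wider catch, the streamer side can notice errorState and run its 
normal teardown instead of waiting forever.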

 

  was:
In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:
{code:java}
2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: 

[jira] [Updated] (HDFS-15219) DFS Client will get stuck when ResponseProcessor.run throws an Error

2020-03-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated HDFS-15219:
---
Description: 
In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:
{code:java}
2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
 at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 4 more
Caused by: java.util.zip.ZipException: error reading zip file
 at java.util.zip.ZipFile.read(Native Method)
 at java.util.zip.ZipFile.access$1400(ZipFile.java:56)
 at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:679)
 at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:415)
 at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
 at sun.misc.Resource.getBytes(Resource.java:124)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:444)
 at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
 ... 10 more
2020-03-11 01:29:02,970 [INFO] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|util.ExitUtil|: Exiting with status -1
2020-03-11 03:27:26,833 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: 
Received should die response from AM
2020-03-11 03:27:26,834 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Asked 
to die via task heartbeat
2020-03-11 03:27:26,839 [INFO] [TaskHeartbeatThread] |task.TezTaskRunner2|: 
Attempting to abort attempt_1583335296048_917815_3_01_000704_0 due to an 
invocation of shutdownRequested

{code}
The reason is an uncaught exception. At 01:29 a disk error occurred, so a 
NoClassDefFoundError was thrown. ResponseProcessor.run only catches Exception 
and cannot catch NoClassDefFoundError, so the ResponseProcessor did not set 
errorState. The DataStreamer therefore did not know the ResponseProcessor was 
dead and could not trigger closeResponder, so it got stuck in DataStreamer.run.

I tested this in the unit test TestDataStream.testDfsClient. When I throw 
NoClassDefFoundError, TestDataStream.testDfsClient fails because of a timeout.

I think we should catch Throwable instead of Exception in ResponseProcessor.run.

 

  was:
In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:
{code:java}
2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
 at 

[jira] [Updated] (HDFS-15219) DFS Client will get stuck when ResponseProcessor.run throws an Error

2020-03-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated HDFS-15219:
---
Description: 
In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:
{code:java}
2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
 at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 4 more
Caused by: java.util.zip.ZipException: error reading zip file
 at java.util.zip.ZipFile.read(Native Method)
 at java.util.zip.ZipFile.access$1400(ZipFile.java:56)
 at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:679)
 at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:415)
 at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
 at sun.misc.Resource.getBytes(Resource.java:124)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:444)
 at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
 ... 10 more
2020-03-11 01:29:02,970 [INFO] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|util.ExitUtil|: Exiting with status -1
2020-03-11 03:27:26,833 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: 
Received should die response from AM
2020-03-11 03:27:26,834 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Asked 
to die via task heartbeat
2020-03-11 03:27:26,839 [INFO] [TaskHeartbeatThread] |task.TezTaskRunner2|: 
Attempting to abort attempt_1583335296048_917815_3_01_000704_0 due to an 
invocation of shutdownRequested

{code}
 

 

  was:
In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:

{code}

2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
 at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 4 more
Caused by: java.util.zip.ZipException: error reading zip file
 at java.util.zip.ZipFile.read(Native Method)
 at 

[jira] [Updated] (HDFS-15219) DFS Client will get stuck when ResponseProcessor.run throws an Error

2020-03-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated HDFS-15219:
---
Description: 
In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:
{code:java}
2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
 at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 4 more
Caused by: java.util.zip.ZipException: error reading zip file
 at java.util.zip.ZipFile.read(Native Method)
 at java.util.zip.ZipFile.access$1400(ZipFile.java:56)
 at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:679)
 at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:415)
 at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
 at sun.misc.Resource.getBytes(Resource.java:124)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:444)
 at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
 ... 10 more
2020-03-11 01:29:02,970 [INFO] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|util.ExitUtil|: Exiting with status -1
2020-03-11 03:27:26,833 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: 
Received should die response from AM
2020-03-11 03:27:26,834 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Asked 
to die via task heartbeat
2020-03-11 03:27:26,839 [INFO] [TaskHeartbeatThread] |task.TezTaskRunner2|: 
Attempting to abort attempt_1583335296048_917815_3_01_000704_0 due to an 
invocation of shutdownRequested

{code}
 The reason is an uncaught exception. ResponseProcessor.run 

 

  was:
In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:
{code:java}
2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
 at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 4 more
Caused by: java.util.zip.ZipException: error reading zip file
 at java.util.zip.ZipFile.read(Native 

[jira] [Created] (HDFS-15219) DFS Client will get stuck when ResponseProcessor.run throws an Error

2020-03-11 Thread zhengchenyu (Jira)
zhengchenyu created HDFS-15219:
--

 Summary: DFS Client will get stuck when ResponseProcessor.run throws 
an Error
 Key: HDFS-15219
 URL: https://issues.apache.org/jira/browse/HDFS-15219
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.7.3
Reporter: zhengchenyu
 Fix For: 3.2.2


In my case, a Tez application was stuck for more than 2 hours until we killed 
the application. The reason is that a task attempt was stuck, because 
speculative execution is disabled. 

Then an exception like this appeared:

{code}

2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 10
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: 
records written - 100
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records 
read - 100
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] 
threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
 at java.lang.String.valueOf(String.java:2847)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
 at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 4 more
Caused by: java.util.zip.ZipException: error reading zip file
 at java.util.zip.ZipFile.read(Native Method)
 at java.util.zip.ZipFile.access$1400(ZipFile.java:56)
 at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:679)
 at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:415)
 at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
 at sun.misc.Resource.getBytes(Resource.java:124)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:444)
 at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
 ... 10 more
2020-03-11 01:29:02,970 [INFO] [ResponseProcessor for block 
BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] 
|util.ExitUtil|: Exiting with status -1
2020-03-11 03:27:26,833 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: 
Received should die response from AM
2020-03-11 03:27:26,834 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Asked 
to die via task heartbeat
2020-03-11 03:27:26,839 [INFO] [TaskHeartbeatThread] |task.TezTaskRunner2|: 
Attempting to abort attempt_1583335296048_917815_3_01_000704_0 due to an 
invocation of shutdownRequested

{code}

 






[jira] [Commented] (HDFS-15077) Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout

2020-03-11 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057516#comment-17057516
 ] 

Masatake Iwasaki commented on HDFS-15077:
-

[~Jim_Brennan] I'm going to backport this to branch-2.10. We cannot use a 
lambda since branch-2.10 still supports Java 7.
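For illustration, this is the kind of rewrite such a backport needs; the 
helper and condition here are assumptions for the sketch, not the actual 
backport patch:
{code:java}
// A Java 8 lambda rewritten as a Java 7 anonymous inner class.
// Assumes org.apache.hadoop.test.GenericTestUtils.waitFor and a Guava
// com.google.common.base.Supplier, as used on branch-2.
// Trunk (Java 8): GenericTestUtils.waitFor(() -> renewerDone.get(), 100, 10000);
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    return renewerDone.get(); // hypothetical flag flipped by the test
  }
}, 100, 10000);
{code}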

> Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout
> 
>
> Key: HDFS-15077
> URL: https://issues.apache.org/jira/browse/HDFS-15077
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
>
> {{TestDFSClientRetries#testLeaseRenewSocketTimeout}} intermittently fails due 
> to race between test thread and LeaseRenewer thread.






[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-03-11 Thread Siddharth Wagle (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057501#comment-17057501
 ] 

Siddharth Wagle commented on HDFS-15154:


Since we are already logging the deprecation, can we just change the warning 
to the following, rather than the cryptic message?
{noformat}
Failed to change storage policy satisfier as storage policies have been 
disabled.
{noformat}

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch, 
> HDFS-15154.09.patch
>
>
> Please provide a way to limit the ability to assign HDFS Storage Policies to 
> HDFS directories to HDFS superusers only.
> Currently, based on HDFS-7093, all storage policies can be disabled 
> cluster-wide by setting dfs.storage.policy.enabled to false.
> But we need a way to allow only HDFS superusers to assign an HDFS Storage 
> Policy to an HDFS directory.






[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-03-11 Thread Siddharth Wagle (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057494#comment-17057494
 ] 

Siddharth Wagle commented on HDFS-15154:


Thanks for the review [~hanishakoneru], I will make those changes.

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch, 
> HDFS-15154.09.patch
>
>
> Please provide a way to limit the ability to assign HDFS Storage Policies to 
> HDFS directories to HDFS superusers only.
> Currently, based on HDFS-7093, all storage policies can be disabled 
> cluster-wide by setting dfs.storage.policy.enabled to false.
> But we need a way to allow only HDFS superusers to assign an HDFS Storage 
> Policy to an HDFS directory.






[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-03-11 Thread Hanisha Koneru (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057484#comment-17057484
 ] 

Hanisha Koneru commented on HDFS-15154:
---

[~swagle], the patch LGTM overall. A few comments:

* We should also call DFSUtil#getDfsStoragePolicySetting in 
StoragePolicySatisfyManager in case the deprecated config is set.
* We might have to change the following log messages to indicate that either 
DFS_STORAGE_POLICIES_ENABLED_KEY is set to DISABLED or 
DFS_STORAGE_POLICY_ENABLED_KEY is set to false.
{code:java}
LOG.info("Failed to change storage policy satisfier as {} set to {}.",
 DFSConfigKeys.DFS_STORAGE_POLICIES_ENABLED_KEY,
 DFSConfigKeys.DfsStoragePolicySetting.DISABLED);{code}
* We could probably add a new method in DFSUtil to check whether StoragePolicy 
is enabled, as that check is done in multiple places; a possible shape is 
sketched below.
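A hypothetical shape for that helper, as a sketch only (not the actual patch):
{code:java}
// Hypothetical sketch of the suggested DFSUtil helper, not the actual
// patch: one place to decide whether storage policies are enabled.
public static boolean isStoragePolicyEnabled(Configuration conf) {
  String setting = conf.get(DFSConfigKeys.DFS_STORAGE_POLICIES_ENABLED_KEY);
  if (setting != null) {
    // the new enum-style key wins when it is present
    return !"DISABLED".equalsIgnoreCase(setting.trim());
  }
  // fall back to the deprecated boolean key, which defaults to true
  return conf.getBoolean("dfs.storage.policy.enabled", true);
}
{code}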

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch, 
> HDFS-15154.09.patch
>
>
> Please provide a way to limit the ability to assign HDFS Storage Policies to 
> HDFS directories to HDFS superusers only.
> Currently, based on HDFS-7093, all storage policies can be disabled 
> cluster-wide by setting dfs.storage.policy.enabled to false.
> But we need a way to allow only HDFS superusers to assign an HDFS Storage 
> Policy to an HDFS directory.






[jira] [Commented] (HDFS-15159) Prevent adding same DN multiple times in PendingReconstructionBlocks

2020-03-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057475#comment-17057475
 ] 

Hadoop QA commented on HDFS-15159:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  0s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}134m 12s{color} 
| {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
49s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}196m 25s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSClientExcludedNodes |
|   | hadoop.hdfs.web.TestWebHDFS |
|   | hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.TestErasureCodingExerciseAPIs |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
|   | hadoop.hdfs.TestFileChecksum |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
|   | hadoop.hdfs.TestWriteReadStripedFile |
|   | hadoop.hdfs.TestDFSInputStreamBlockLocations |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.server.namenode.sps.TestStoragePolicySatisfierWithStripedFile 
|
|   | hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   | hadoop.hdfs.TestDecommissionWithStriped |
|   | hadoop.hdfs.server.namenode.ha.TestHASafeMode |
|   | hadoop.hdfs.TestErasureCodingPolicyWithSnapshot |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| 

[jira] [Commented] (HDFS-15077) Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout

2020-03-11 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057423#comment-17057423
 ] 

Jim Brennan commented on HDFS-15077:


[~iwasakims], [~aajisaka] we have seen this failure (rarely) in our automated 
tests for our internal branch-2.10 build.  I believe the patch applies cleanly. 
 Could we get it pulled back to branch-2.10?


> Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout
> 
>
> Key: HDFS-15077
> URL: https://issues.apache.org/jira/browse/HDFS-15077
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
>
> {{TestDFSClientRetries#testLeaseRenewSocketTimeout}} intermittently fails due 
> to race between test thread and LeaseRenewer thread.






[jira] [Updated] (HDFS-15159) Prevent adding same DN multiple times in PendingReconstructionBlocks

2020-03-11 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15159:
-
Attachment: HDFS-15159.002.patch

> Prevent adding same DN multiple times in PendingReconstructionBlocks
> 
>
> Key: HDFS-15159
> URL: https://issues.apache.org/jira/browse/HDFS-15159
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15159.001.patch, HDFS-15159.002.patch
>
>







[jira] [Updated] (HDFS-12136) BlockSender performance regression due to volume scanner edge case

2020-03-11 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-12136:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Resolved by HDFS-11187 

> BlockSender performance regression due to volume scanner edge case
> --
>
> Key: HDFS-12136
> URL: https://issues.apache.org/jira/browse/HDFS-12136
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12136.branch-2.patch, HDFS-12136.trunk.patch
>
>
> HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan 
> by reading the last checksum of finalized blocks within the {{BlockSender}} 
> ctor. Unfortunately it holds the exclusive dataset lock to open and read the 
> metafile multiple times, so block sender instantiation becomes serialized.
> Performance completely collapses under heavy disk I/O utilization or high 
> xceiver activity, e.g. lost-node replication, balancing, or decommissioning. 
> The xceiver threads congest creating block senders and impair the heartbeat 
> processing that is contending for the same lock. Combined with other lock 
> contention issues, pipelines break and nodes sporadically go dead.
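Schematically, the contrast is the following (hypothetical names, not the 
actual BlockSender code):
{code:java}
import java.io.File;

// Illustrative only; hypothetical names, not the actual BlockSender code.
class LockShapeSketch {
  private final Object dataset = new Object();

  // Shape described above: disk I/O inside the exclusive lock serializes
  // every block sender instantiation.
  long contendedRead(File metaFile) {
    synchronized (dataset) {
      return metaFile.length(); // stand-in for the metafile disk reads
    }
  }

  // A generally less contended shape: resolve references under the lock,
  // then do the disk I/O outside it.
  long lessContendedRead(String blockId) {
    final File meta;
    synchronized (dataset) {
      meta = new File(blockId + ".meta"); // cheap in-memory resolution
    }
    return meta.length(); // disk read happens outside the lock
  }
}
{code}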






[jira] [Updated] (HDFS-14338) TestPread timeouts in branch-2.8

2020-03-11 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14338:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Branch-2.8 is EOL. Resolving as Won't Fix.

> TestPread timeouts in branch-2.8
> 
>
> Key: HDFS-14338
> URL: https://issues.apache.org/jira/browse/HDFS-14338
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-14338-001.patch, 
> HDFS-14338-branch-2.8-001-testing.patch, HDFS-14338-branch-2.8-001.patch
>
>
> TestPread timeouts in branch-2.8.
> {noformat}
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support 
> was removed in 8.0
> Running org.apache.hadoop.hdfs.TestPread
> Results :
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
> {noformat}






[jira] [Commented] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-03-11 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057263#comment-17057263
 ] 

Wei-Chiu Chuang commented on HDFS-15039:


The patch doesn't apply any more. Updated the patch to resolve conflicts.

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.006.patch, HDFS-15039.patch, 
> HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch
>
>
> When ReplicaCachingGetSpaceUsed is used to get the volume space used, it 
> will call File.length() for every replica meta file. That adds more disk IO; 
> we found the slow log below. For a finalized replica, the size of the meta 
> file does not change, so I think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}
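A minimal sketch of the caching idea (hypothetical names, not the actual 
patch):
{code:java}
import java.io.File;

// Hypothetical sketch of the caching idea, not the actual patch: for a
// finalized replica the meta file no longer changes, so File.length()
// needs to hit the disk only once.
class FinalizedReplicaSketch {
  private final File metaFile;
  private long cachedMetaLength = -1; // -1 means "not read yet"

  FinalizedReplicaSketch(File metaFile) {
    this.metaFile = metaFile;
  }

  long getMetaLength() {
    if (cachedMetaLength < 0) {
      cachedMetaLength = metaFile.length(); // single disk access
    }
    return cachedMetaLength;
  }
}
{code}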






[jira] [Updated] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-03-11 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15039:
---
Attachment: HDFS-15039.006.patch

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.006.patch, HDFS-15039.patch, 
> HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch
>
>
> When ReplicaCachingGetSpaceUsed is used to get the volume space used, it 
> will call File.length() for every replica meta file. That adds more disk IO; 
> we found the slow log below. For a finalized replica, the size of the meta 
> file does not change, so I think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}






[jira] [Updated] (HDFS-13351) Revert HDFS-11156 from branch-2/branch-2.8

2020-03-11 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13351:
---
Target Version/s: 2.10.1
  Labels: release-blocker  (was: )

> Revert HDFS-11156 from branch-2/branch-2.8
> --
>
> Key: HDFS-13351
> URL: https://issues.apache.org/jira/browse/HDFS-13351
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: webhdfs
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: release-blocker
> Attachments: HDFS-13351-branch-2.001.patch, 
> HDFS-13351-branch-2.002.patch, HDFS-13351-branch-2.003.patch
>
>
> Per discussion in HDFS-11156, lets revert the change from branch-2 and 
> branch-2.8. New patch can be tracked in HDFS-12459 .






[jira] [Commented] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-03-11 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057254#comment-17057254
 ] 

Wei-Chiu Chuang commented on HDFS-15039:


+1

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch, 
> HDFS-15039.patch, HDFS-15039.patch
>
>
> When ReplicaCachingGetSpaceUsed is used to get the volume space used, it 
> will call File.length() for every replica meta file. That adds more disk IO; 
> we found the slow log below. For a finalized replica, the size of the meta 
> file does not change, so I think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}






[jira] [Commented] (HDFS-14820) The default 8KB buffer of BlockReaderRemote#newBlockReader#BufferedOutputStream is too big

2020-03-11 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057249#comment-17057249
 ] 

Wei-Chiu Chuang commented on HDFS-14820:


I am +1 and will commit by the end of the week unless there's an objection to 
my explanation above. Thanks.

>  The default 8KB buffer of 
> BlockReaderRemote#newBlockReader#BufferedOutputStream is too big
> ---
>
> Key: HDFS-14820
> URL: https://issues.apache.org/jira/browse/HDFS-14820
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14820.001.patch, HDFS-14820.002.patch, 
> HDFS-14820.003.patch
>
>
> This issue is similar to HDFS-14535.
> {code:java}
> public static BlockReader newBlockReader(String file,
> ExtendedBlock block,
> Token<BlockTokenIdentifier> blockToken,
> long startOffset, long len,
> boolean verifyChecksum,
> String clientName,
> Peer peer, DatanodeID datanodeID,
> PeerCache peerCache,
> CachingStrategy cachingStrategy,
> int networkDistance) throws IOException {
>   // in and out will be closed when sock is closed (by the caller)
>   final DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
>   peer.getOutputStream()));
>   new Sender(out).readBlock(block, blockToken, clientName, startOffset, len,
>   verifyChecksum, cachingStrategy);
> }
> public BufferedOutputStream(OutputStream out) {
> this(out, 8192);
> }
> {code}
> The Sender#readBlock parameters (block, blockToken, clientName, startOffset, 
> len, verifyChecksum, cachingStrategy) do not need such a big buffer.
> So I think we should reduce the BufferedOutputStream buffer size.
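A sketch of the suggested change to the snippet quoted above; the 512-byte 
size is an assumption for illustration (in the direction of HDFS-14535), not 
necessarily the committed value:
{code:java}
// Sketch of the suggestion: pass an explicit, small buffer size instead of
// relying on BufferedOutputStream's 8 KB default. 512 bytes is an assumed
// figure for illustration, not necessarily the committed value.
final DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
    peer.getOutputStream(), 512));
{code}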






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057033#comment-17057033
 ] 

Hadoop QA commented on HDFS-15160:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 11s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 25s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.7 Server=19.03.7 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15160 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12996405/HDFS-15160.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a2221593bbe8 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cf9cf83 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28927/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28927/testReport/ |
| Max. process+thread count | 3478 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Commented] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056932#comment-17056932
 ] 

Hadoop QA commented on HDFS-15216:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 19 unchanged - 1 fixed = 19 total (was 20) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}155m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.7 Server=19.03.7 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15216 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12996314/HDFS-15216.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f15c0de2c4ab 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cf9cf83 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28926/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-11 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056905#comment-17056905
 ] 

Stephen O'Donnell commented on HDFS-15160:
--

Uploaded v003 switching DataNode#transferReplicaForPipelineRecovery to the read 
lock.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the read 
> lock rather than the write lock.
> Additionally, some methods call replicaMap.replicas() (e.g. getBlockReports, 
> getFinalizedBlocks, deepCopyReplica) and only use the result in a read-only 
> fashion, so they can also be switched to a read lock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything that looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in its own 
> Jira.
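A minimal sketch of the locking pattern this change moves toward, using a plain 
java.util.concurrent ReentrantReadWriteLock; the class, field, and method names 
here are illustrative assumptions, not the actual FsDatasetImpl code:

{code:java}
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockSketch {
  private final ReadWriteLock datasetLock = new ReentrantReadWriteLock();
  private long replicaCount;

  // Read-only operation: many threads may hold the read lock concurrently,
  // which is the concurrency win described above.
  public long getReplicaCount() {
    datasetLock.readLock().lock();
    try {
      return replicaCount;
    } finally {
      datasetLock.readLock().unlock();
    }
  }

  // Mutation: the write lock excludes both readers and other writers, so
  // readers never observe a partially applied update.
  public void addReplica() {
    datasetLock.writeLock().lock();
    try {
      replicaCount++;
    } finally {
      datasetLock.writeLock().unlock();
    }
  }
}
{code}

With this split, read-mostly callers such as getBlockReports or the directory 
scanner can proceed in parallel, while mutations still serialize behind the 
write lock.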



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-11 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-15160:
-
Attachment: HDFS-15160.003.patch

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the read 
> lock rather than the write lock.
> Additionally, some methods call replicaMap.replicas() (e.g. getBlockReports, 
> getFinalizedBlocks, deepCopyReplica) and only use the result in a read-only 
> fashion, so they can also be switched to a read lock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything that looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in its own 
> Jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-11 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056786#comment-17056786
 ] 

Stephen O'Donnell commented on HDFS-15216:
--

+1 on this change. The unit test failures seem unrelated ("unable to create 
native thread" errors). I have triggered the build job again to be sure.

> Wrong Use Case of -showprogress in fsck 
> 
>
> Key: HDFS-15216
> URL: https://issues.apache.org/jira/browse/HDFS-15216
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15216.001.patch
>
>
> *-showprogress* is deprecated and progress is now shown by default, but the 
> fsck help output still shows an incorrect usage description for it:
>  
> Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
> -upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies] 
> [-maintenance] [-blockId <blk_Id>]
>  <path>  start checking from this path
>  -move move corrupted files to /lost+found
>  -delete delete corrupted files
>  -files print out files being checked
>  -openforwrite print out files opened for write
>  -includeSnapshots include snapshot data if the given path indicates a 
> snapshottable directory or there are snapshottable directories under it
>  -list-corruptfileblocks print out list of missing blocks and files they 
> belong to
>  -files -blocks print out block report
>  -files -blocks -locations print out locations for every block
>  -files -blocks -racks print out network topology for data-node locations
>  -files -blocks -replicaDetails print out each replica details
>  -files -blocks -upgradedomains print out upgrade domains for every block
>  -storagepolicies print out storage policy summary for the blocks
>  -maintenance print out maintenance state node details
>  *-showprogress show progress in output. Default is OFF (no progress)*
>  -blockId print out which file this blockId belongs to, locations (nodes, 
> racks) of this block, and other diagnostics info (under replicated, corrupted 
> or not, etc)
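For context, a minimal sketch of how a deprecated no-op flag like -showprogress 
is commonly handled, keeping it accepted for compatibility while progress is 
always shown; the class name is illustrative, not the actual DFSck code:

{code:java}
public class DeprecatedFlagSketch {
  public static void main(String[] args) {
    for (String arg : args) {
      if ("-showprogress".equals(arg)) {
        // The flag is accepted but ignored: progress is printed by default.
        System.err.println(
            "-showprogress is deprecated; progress is now shown by default.");
      }
    }
  }
}
{code}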



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org