[jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163970#comment-16163970
 ] 

Hadoop QA commented on HDFS-12412:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 415 unchanged - 2 fixed = 415 total (was 417) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 38s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}112m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.tools.TestDFSAdminWithHA |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
|   | hadoop.hdfs.qjournal.TestNNWithQJM |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.TestReconstructStripedFile |
| Timed out junit tests | 
org.apache.hadoop.hdfs.TestReadStripedFileWithDecodingCorruptData |
|   | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyInProgressTail |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestXAttrsWithHA |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12412 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12886518/HDFS-12412.00.patch |
| Optional Tests |  asflicense  

[jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163958#comment-16163958
 ] 

Andrew Wang commented on HDFS-12412:


I also filed and posted a patch on HDFS-12438 for the config renaming I 
mentioned.

> Remove ErasureCodingWorker.stripedReadPool
> --
>
> Key: HDFS-12412
> URL: https://issues.apache.org/jira/browse/HDFS-12412
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12412.00.patch, HDFS-12412.01.patch
>
>
> In {{ErasureCodingWorker}}, it uses {{stripedReconstructionPool}} to schedule 
> the EC recovery tasks, while uses {{stripedReadPool}} for the reader threads 
> in each recovery task.  We only need one of them to throttle the speed of 
> recovery process, because each EC recovery task has a fix number of source 
> readers (i.e., 3 for RS(3,2)). And because of the findings in HDFS-12044, the 
> speed of EC recovery can be throttled by {{strippedReconstructionPool}} with 
> {{xmitsInProgress}}. 
> Moreover, keeping {{stripedReadPool}} makes customer difficult to understand 
> and calculate the right balance between 
> {{dfs.datanode.ec.reconstruction.stripedread.threads}}, 
> {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and 
> {{maxReplicationStreams}}.  For example, a small {{stripread.threads}} 
> (comparing to which {{reconstruction.threads.size}} implies), will 
> unnecessarily limit the speed of recovery, which leads to larger MTTR. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163953#comment-16163953
 ] 

Andrew Wang commented on HDFS-12412:


+1 LGTM pending precommit, thanks Eddy.

> Remove ErasureCodingWorker.stripedReadPool
> --
>
> Key: HDFS-12412
> URL: https://issues.apache.org/jira/browse/HDFS-12412
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12412.00.patch, HDFS-12412.01.patch
>
>
> In {{ErasureCodingWorker}}, it uses {{stripedReconstructionPool}} to schedule 
> the EC recovery tasks, while uses {{stripedReadPool}} for the reader threads 
> in each recovery task.  We only need one of them to throttle the speed of 
> recovery process, because each EC recovery task has a fix number of source 
> readers (i.e., 3 for RS(3,2)). And because of the findings in HDFS-12044, the 
> speed of EC recovery can be throttled by {{strippedReconstructionPool}} with 
> {{xmitsInProgress}}. 
> Moreover, keeping {{stripedReadPool}} makes customer difficult to understand 
> and calculate the right balance between 
> {{dfs.datanode.ec.reconstruction.stripedread.threads}}, 
> {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and 
> {{maxReplicationStreams}}.  For example, a small {{stripread.threads}} 
> (comparing to which {{reconstruction.threads.size}} implies), will 
> unnecessarily limit the speed of recovery, which leads to larger MTTR. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162208#comment-16162208
 ] 

Andrew Wang commented on HDFS-12412:


While I was looking at this a little more, I noticed that 
"dfs.datanode.ec.reconstruction.stripedblock.threads.size" is named poorly. 
Could you do another JIRA to rename this to drop the ".size" suffix, for 
consistency with other similar config keys that control the size of a thread 
pool?

It'd also be good to add a release note to this JIRA to point users at what 
keys to use for tuning reconstruction performance. If you want to update the EC 
user docs as part of this JIRA too, that'd also be good.

> Remove ErasureCodingWorker.stripedReadPool
> --
>
> Key: HDFS-12412
> URL: https://issues.apache.org/jira/browse/HDFS-12412
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12412.00.patch
>
>
> In {{ErasureCodingWorker}}, it uses {{stripedReconstructionPool}} to schedule 
> the EC recovery tasks, while uses {{stripedReadPool}} for the reader threads 
> in each recovery task.  We only need one of them to throttle the speed of 
> recovery process, because each EC recovery task has a fix number of source 
> readers (i.e., 3 for RS(3,2)). And because of the findings in HDFS-12044, the 
> speed of EC recovery can be throttled by {{strippedReconstructionPool}} with 
> {{xmitsInProgress}}. 
> Moreover, keeping {{stripedReadPool}} makes customer difficult to understand 
> and calculate the right balance between 
> {{dfs.datanode.ec.reconstruction.stripedread.threads}}, 
> {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and 
> {{maxReplicationStreams}}.  For example, a small {{stripread.threads}} 
> (comparing to which {{reconstruction.threads.size}} implies), will 
> unnecessarily limit the speed of recovery, which leads to larger MTTR. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162204#comment-16162204
 ] 

Andrew Wang commented on HDFS-12412:


Thanks for working on this Eddy. I did a grep for the removed config key:

{noformat}
-> % ag "stripedread.threads" 
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md
140:  1. `dfs.datanode.ec.reconstruction.stripedread.threads` - Number of 
concurrent reader threads. Default value is 20 threads.

hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
3052:  dfs.datanode.ec.reconstruction.stripedread.threads
{noformat}

We need to update these as well. Otherwise, +1 LGTM!

> Remove ErasureCodingWorker.stripedReadPool
> --
>
> Key: HDFS-12412
> URL: https://issues.apache.org/jira/browse/HDFS-12412
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12412.00.patch
>
>
> In {{ErasureCodingWorker}}, it uses {{stripedReconstructionPool}} to schedule 
> the EC recovery tasks, while uses {{stripedReadPool}} for the reader threads 
> in each recovery task.  We only need one of them to throttle the speed of 
> recovery process, because each EC recovery task has a fix number of source 
> readers (i.e., 3 for RS(3,2)). And because of the findings in HDFS-12044, the 
> speed of EC recovery can be throttled by {{strippedReconstructionPool}} with 
> {{xmitsInProgress}}. 
> Moreover, keeping {{stripedReadPool}} makes customer difficult to understand 
> and calculate the right balance between 
> {{dfs.datanode.ec.reconstruction.stripedread.threads}}, 
> {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and 
> {{maxReplicationStreams}}.  For example, a small {{stripread.threads}} 
> (comparing to which {{reconstruction.threads.size}} implies), will 
> unnecessarily limit the speed of recovery, which leads to larger MTTR. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161724#comment-16161724
 ] 

Andrew Wang commented on HDFS-12412:


Agree, nice idea here Eddy. We can always add more tunables later.

> Remove ErasureCodingWorker.stripedReadPool
> --
>
> Key: HDFS-12412
> URL: https://issues.apache.org/jira/browse/HDFS-12412
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>  Labels: hdfs-ec-3.0-nice-to-have
>
> In {{ErasureCodingWorker}}, it uses {{stripedReconstructionPool}} to schedule 
> the EC recovery tasks, while uses {{stripedReadPool}} for the reader threads 
> in each recovery task.  We only need one of them to throttle the speed of 
> recovery process, because each EC recovery task has a fix number of source 
> readers (i.e., 3 for RS(3,2)). And because of the findings in HDFS-12044, the 
> speed of EC recovery can be throttled by {{strippedReconstructionPool}} with 
> {{xmitsInProgress}}. 
> Moreover, keeping {{stripedReadPool}} makes customer difficult to understand 
> and calculate the right balance between 
> {{dfs.datanode.ec.reconstruction.stripedread.threads}}, 
> {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and 
> {{maxReplicationStreams}}.  For example, a small {{stripread.threads}} 
> (comparing to which {{reconstruction.threads.size}} implies), will 
> unnecessarily limit the speed of recovery, which leads to larger MTTR. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-08 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159682#comment-16159682
 ] 

Kai Zheng commented on HDFS-12412:
--

Thanks Eddy for the ping. 

The idea to remove the striped read pool and reuse the same reconstruction pool 
sounds good to me, since given the later and the most often used erasure codec, 
we can roughly estimate the striped read threads need. We can also simplify the 
configuration and codes.

So as you said, you probably have the idea how to reduce the recommended value 
or default value and validate the configuration value for the reconstruction 
pool size, assuming you know how many concurrent reconstruction tasks to be 
performed and so on.

Less configuration with reasonable defaults would make the brand feature more 
easier to use. When needed, we can fine-tune and add more later.

> Remove ErasureCodingWorker.stripedReadPool
> --
>
> Key: HDFS-12412
> URL: https://issues.apache.org/jira/browse/HDFS-12412
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>
> In {{ErasureCodingWorker}}, it uses {{stripedReconstructionPool}} to schedule 
> the EC recovery tasks, while uses {{stripedReadPool}} for the reader threads 
> in each recovery task.  We only need one of them to throttle the speed of 
> recovery process, because each EC recovery task has a fix number of source 
> readers (i.e., 3 for RS(3,2)). And because of the findings in HDFS-12044, the 
> speed of EC recovery can be throttled by {{strippedReconstructionPool}} with 
> {{xmitsInProgress}}. 
> Moreover, keeping {{stripedReadPool}} makes customer difficult to understand 
> and calculate the right balance between 
> {{dfs.datanode.ec.reconstruction.stripedread.threads}}, 
> {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and 
> {{maxReplicationStreams}}.  For example, a small {{stripread.threads}} 
> (comparing to which {{reconstruction.threads.size}} implies), will 
> unnecessarily limit the speed of recovery, which leads to larger MTTR. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-08 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159488#comment-16159488
 ] 

Lei (Eddy) Xu commented on HDFS-12412:
--

Ping [~drankye] [~Sammi] [~andrew.wang] for the inputs.

> Remove ErasureCodingWorker.stripedReadPool
> --
>
> Key: HDFS-12412
> URL: https://issues.apache.org/jira/browse/HDFS-12412
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>
> In {{ErasureCodingWorker}}, it uses {{stripedReconstructionPool}} to schedule 
> the EC recovery tasks, while uses {{stripedReadPool}} for the reader threads 
> in each recovery task.  We only need one of them to throttle the speed of 
> recovery process, because each EC recovery task has a fix number of source 
> readers (i.e., 3 for RS(3,2)). And because of the findings in HDFS-12044, the 
> speed of EC recovery can be throttled by {{strippedReconstructionPool}} with 
> {{xmitsInProgress}}. 
> Moreover, keeping {{stripedReadPool}} makes customer difficult to understand 
> and calculate the right balance between 
> {{dfs.datanode.ec.reconstruction.stripedread.threads}}, 
> {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and 
> {{maxReplicationStreams}}.  For example, a small {{stripread.threads}} 
> (comparing to which {{reconstruction.threads.size}} implies), will 
> unnecessarily limit the speed of recovery, which leads to larger MTTR. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org