[jira] [Commented] (HDFS-15168) ABFS driver enhancement - Allow customizable translation from AAD SPNs and security groups to Linux user and group

2020-02-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044177#comment-17044177
 ] 

Hadoop QA commented on HDFS-15168:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
51s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 16s{color} | {color:orange} hadoop-tools/hadoop-azure: The patch generated 2 
new + 1 unchanged - 0 fixed = 3 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
23s{color} | {color:green} hadoop-azure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1858/2/artifact/out/Dockerfile
 |
| GITHUB PR | https://github.com/apache/hadoop/pull/1858 |
| JIRA Issue | HDFS-15168 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 0ae39372a98f 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / dda00d3 |
| Default Java | 1.8.0_242 |
| checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1858/2/artifact/out/diff-checkstyle-hadoop-tools_hadoop-azure.txt
 |
|  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1858/2/testReport/ |
| Max. process+thread count | 448 (vs. ulimit of 5500) 

[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Yao Guangdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044148#comment-17044148
 ] 

Yao Guangdong commented on HDFS-15186:
--

[~ferhui], Thanks for your review and advice. I have been fixed it. Please 
check it again.

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch, 
> HDFS-15186.003.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Yao Guangdong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yao Guangdong updated HDFS-15186:
-
Attachment: HDFS-15186.003.patch

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch, 
> HDFS-15186.003.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15168) ABFS driver enhancement - Allow customizable translation from AAD SPNs and security groups to Linux user and group

2020-02-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044135#comment-17044135
 ] 

Hadoop QA commented on HDFS-15168:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 25m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
52s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 17s{color} | {color:orange} hadoop-tools/hadoop-azure: The patch generated 6 
new + 1 unchanged - 0 fixed = 7 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-tools_hadoop-azure generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
23s{color} | {color:green} hadoop-azure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 82m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1858/1/artifact/out/Dockerfile
 |
| GITHUB PR | https://github.com/apache/hadoop/pull/1858 |
| JIRA Issue | HDFS-15168 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 4ee4629e5f6a 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / dda00d3 |
| Default Java | 1.8.0_242 |
| checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1858/1/artifact/out/diff-checkstyle-hadoop-tools_hadoop-azure.txt
 |
| javadoc | 

[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044123#comment-17044123
 ] 

Fei Hui commented on HDFS-15186:


[~yaoguangdong] Thanks for your patch
HDFS-15186.002.patch the whole fix looks good. Minor comments
{quote}
+//4. wait for decommissioning and not busy block to replicate
+Thread.sleep(3000);
{quote}
Here maybe  it will be good that GenericTestUtils.waitFor instead of it.

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-02-24 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044114#comment-17044114
 ] 

Yang Yun commented on HDFS-15039:
-

Done, Thanks for review!

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch, 
> HDFS-15039.patch
>
>
> When use ReplicaCachingGetSpaceUsed to get the volume space used.  It will 
> call File.length() for every meta file of replica. That add more disk IO, we 
> found the slow log as below. For finalized replica, the size of meta file is 
> not changed, i think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-02-24 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15039:

Attachment: HDFS-15039.patch
Status: Patch Available  (was: Open)

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch, 
> HDFS-15039.patch
>
>
> When use ReplicaCachingGetSpaceUsed to get the volume space used.  It will 
> call File.length() for every meta file of replica. That add more disk IO, we 
> found the slow log as below. For finalized replica, the size of meta file is 
> not changed, i think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-02-24 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15039:

Status: Open  (was: Patch Available)

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch
>
>
> When use ReplicaCachingGetSpaceUsed to get the volume space used.  It will 
> call File.length() for every meta file of replica. That add more disk IO, we 
> found the slow log as below. For finalized replica, the size of meta file is 
> not changed, i think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-02-24 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044110#comment-17044110
 ] 

Lisheng Sun commented on HDFS-15039:


hi [~hadoop_yangyun] It is recommended not to make unrelated changes.
{code:java}
import java.io.*;
{code}


> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch
>
>
> When use ReplicaCachingGetSpaceUsed to get the volume space used.  It will 
> call File.length() for every meta file of replica. That add more disk IO, we 
> found the slow log as below. For finalized replica, the size of meta file is 
> not changed, i think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Yao Guangdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043133#comment-17043133
 ] 

Yao Guangdong edited comment on HDFS-15186 at 2/25/20 3:08 AM:
---

[~ferhui], [~ayushtkn], [~gjhkael]  Thanks for yours patient review. I agree 
with yours point that fix it in namenode side. I have a suspicion. We can copy 
blocks from others DN in 3 replica mode when we decommission DN and the 
decommissioning DN is busy. But, we only can copy blocks from the 
decommissioning DN in ec mode if we don't reconstruct it. The time we cost in 
decommission is 69 hours (1m / 4 / 3600 = 69‬ hours) if we have 1 million 
blocks in one DN and the cost time we copy a block is one second and the hard 
limit is default 4. Which will make the speed of decommission very slow if we 
copy all blocks from decommission DN and we add decommissioning busy replica 
into live replica check? Is my comprehend right?


was (Author: yaoguangdong):
[~ferhui], [~ayushtkn], [~gjhkael]  Thanks for yours patient review. I agree 
with yours point that fix it in namenode side. I have a suspicion. We can copy 
blocks from others DN in 3 replica mode when we decommission DN and the 
decommissioning DN is busy. But, we only can copy blocks from the 
decommissioning DN in ec mode if we don't reconstruct it. The time we cost in 
decommission is 69 hours (100W / 4 / 3600 = 69‬ hours) if we have 100W blocks 
in one DN and the cost time we copy a block is one second and the hard limit is 
default 4. Which will make the speed of decommission very slow if we copy all 
blocks from decommission DN and we add decommissioning busy replica into live 
replica check? Is my comprehend right?

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Yao Guangdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044085#comment-17044085
 ] 

Yao Guangdong commented on HDFS-15186:
--

[~ferhui], [~ayushtkn], [~gjhkael], [~marvelrock], I have been fixed it in 
namenode side and add a new patch HDFS-15186.002.patch. Could you have time to 
review it. Thanks.

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Yao Guangdong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yao Guangdong updated HDFS-15186:
-
Attachment: HDFS-15186.002.patch

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-02-24 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15039:

Attachment: HDFS-15039.patch
Status: Patch Available  (was: Open)

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch
>
>
> When use ReplicaCachingGetSpaceUsed to get the volume space used.  It will 
> call File.length() for every meta file of replica. That add more disk IO, we 
> found the slow log as below. For finalized replica, the size of meta file is 
> not changed, i think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-02-24 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15039:

Attachment: HDFS-15039.patch

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch, HDFS-15039.patch
>
>
> When use ReplicaCachingGetSpaceUsed to get the volume space used.  It will 
> call File.length() for every meta file of replica. That add more disk IO, we 
> found the slow log as below. For finalized replica, the size of meta file is 
> not changed, i think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-02-24 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15039:

Status: Open  (was: Patch Available)

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch
>
>
> When use ReplicaCachingGetSpaceUsed to get the volume space used.  It will 
> call File.length() for every meta file of replica. That add more disk IO, we 
> found the slow log as below. For finalized replica, the size of meta file is 
> not changed, i think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-02-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043905#comment-17043905
 ] 

Hadoop QA commented on HDFS-15154:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 28m 
50s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 56s{color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs generated 3 new + 581 unchanged - 
0 fixed = 584 total (was 581) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 1063 unchanged - 2 fixed = 1063 total (was 1065) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 28s{color} 
| {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}198m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15154 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 6ceb9473a6ca 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9290040 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28836/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 

[jira] [Commented] (HDFS-15174) Optimize ReplicaCachingGetSpaceUsed by reducing unnecessary io operations

2020-02-24 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043895#comment-17043895
 ] 

Hudson commented on HDFS-15174:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17988 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17988/])
HDFS-15174. Optimize ReplicaCachingGetSpaceUsed by reducing unnecessary 
(weichiu: rev 1c5d2f1fdc40b77731bc13973876b567865888d1)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaCachingGetSpaceUsed.java


> Optimize ReplicaCachingGetSpaceUsed by reducing unnecessary io operations
> -
>
> Key: HDFS-15174
> URL: https://issues.apache.org/jira/browse/HDFS-15174
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15174-001.patch
>
>
> Calculating the size of each block and the size of the meta file requires io 
> operation In ReplicaCachingGetSpaceUsed#refresh(). Pressure on disk 
> performance when there are many block. HDFS-14313 is intended to reduce io 
> operation. So get block size by ReplicaInfo and meta size by 
> DataChecksum#getChecksumSize().
> {code:java}
> @Override
>   protected void refresh() {
>   if (CollectionUtils.isNotEmpty(replicaInfos)) {
> for (ReplicaInfo replicaInfo : replicaInfos) {
>   if (Objects.equals(replicaInfo.getVolume().getStorageID(),
>   volume.getStorageID())) {
> dfsUsed += replicaInfo.getBlockDataLength();
> dfsUsed += replicaInfo.getMetadataLength();
> count++;
>   }
> }
>   }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-02-24 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043894#comment-17043894
 ] 

Wei-Chiu Chuang commented on HDFS-15039:


Patch has a conflict so submit a new patch and let it run through the precommit.

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.patch
>
>
> When use ReplicaCachingGetSpaceUsed to get the volume space used.  It will 
> call File.length() for every meta file of replica. That add more disk IO, we 
> found the slow log as below. For finalized replica, the size of meta file is 
> not changed, i think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15190) HttpFS : Add Support for Storage Policy Satisfier

2020-02-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043891#comment-17043891
 ] 

Hadoop QA commented on HDFS-15190:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 19s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-httpfs: The 
patch generated 1 new + 461 unchanged - 0 fixed = 462 total (was 461) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
28s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15190 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12994359/HDFS-15190.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3c2550f6cc3e 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9290040 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28837/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28837/testReport/ |
| Max. process+thread count | 625 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-httpfs U: 
hadoop-hdfs-project/hadoop-hdfs-httpfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28837/console |
| Powered by | Apache 

[jira] [Updated] (HDFS-15174) Optimize ReplicaCachingGetSpaceUsed by reducing unnecessary io operations

2020-02-24 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15174:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch applies cleanly in trunk branch-3.2 and branch-3.1.
Branch-2.10 will require an update.

> Optimize ReplicaCachingGetSpaceUsed by reducing unnecessary io operations
> -
>
> Key: HDFS-15174
> URL: https://issues.apache.org/jira/browse/HDFS-15174
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15174-001.patch
>
>
> Calculating the size of each block and the size of the meta file requires io 
> operation In ReplicaCachingGetSpaceUsed#refresh(). Pressure on disk 
> performance when there are many block. HDFS-14313 is intended to reduce io 
> operation. So get block size by ReplicaInfo and meta size by 
> DataChecksum#getChecksumSize().
> {code:java}
> @Override
>   protected void refresh() {
>   if (CollectionUtils.isNotEmpty(replicaInfos)) {
> for (ReplicaInfo replicaInfo : replicaInfos) {
>   if (Objects.equals(replicaInfo.getVolume().getStorageID(),
>   volume.getStorageID())) {
> dfsUsed += replicaInfo.getBlockDataLength();
> dfsUsed += replicaInfo.getMetadataLength();
> count++;
>   }
> }
>   }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15174) Optimize ReplicaCachingGetSpaceUsed by reducing unnecessary io operations

2020-02-24 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15174:
---
Fix Version/s: 3.2.2
   3.1.4
   3.3.0

> Optimize ReplicaCachingGetSpaceUsed by reducing unnecessary io operations
> -
>
> Key: HDFS-15174
> URL: https://issues.apache.org/jira/browse/HDFS-15174
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15174-001.patch
>
>
> Calculating the size of each block and the size of the meta file requires io 
> operation In ReplicaCachingGetSpaceUsed#refresh(). Pressure on disk 
> performance when there are many block. HDFS-14313 is intended to reduce io 
> operation. So get block size by ReplicaInfo and meta size by 
> DataChecksum#getChecksumSize().
> {code:java}
> @Override
>   protected void refresh() {
>   if (CollectionUtils.isNotEmpty(replicaInfos)) {
> for (ReplicaInfo replicaInfo : replicaInfos) {
>   if (Objects.equals(replicaInfo.getVolume().getStorageID(),
>   volume.getStorageID())) {
> dfsUsed += replicaInfo.getBlockDataLength();
> dfsUsed += replicaInfo.getMetadataLength();
> count++;
>   }
> }
>   }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-02-24 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043864#comment-17043864
 ] 

Chen Liang edited comment on HDFS-15191 at 2/24/20 9:01 PM:


 There could be token compatibility issue though, if you only have HDFS-13617, 
but not HDFS-14611. If both changes are there, this should be fine. But even if 
HDFS-14611 is missing, I would expect a different error. Because seems the 
error happened at the very first call of {{readVLong}} when parsing the token. 
Those two Jiras only changes the behavior of tails of the block token. Also, 
even if we hit compatibility issue, I expect it to only affect the selective 
SASL feature. Will be watching this issue.


was (Author: vagarychen):
 There could be token compatibility issue though, if you only have HDFS-13617, 
but not HDFS-14611. If both changes are there, this should be fine. But even if 
HDFS-14611 is missing, I would expect a different error. Because seems the 
error happened at the very first call of {{readVLong}} when parsing the token. 
Those two Jiras only changes the behavior of tails of the block token.

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Priority: Major
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem 

[jira] [Commented] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-02-24 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043864#comment-17043864
 ] 

Chen Liang commented on HDFS-15191:
---

 There could be token compatibility issue though, if you only have HDFS-13617, 
but not HDFS-14611. If both changes are there, this should be fine. But even if 
HDFS-14611 is missing, I would expect a different error. Because seems the 
error happened at the very first call of {{readVLong}} when parsing the token. 
Those two Jiras only changes the behavior of tails of the block token.

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Priority: Major
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem related, so tagging 
> [~vagarychen] in case you have any ideas.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-02-24 Thread Steven Rand (Jira)
Steven Rand created HDFS-15191:
--

 Summary: EOF when reading legacy buffer in BlockTokenIdentifier
 Key: HDFS-15191
 URL: https://issues.apache.org/jira/browse/HDFS-15191
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 3.2.1
Reporter: Steven Rand


We have an HDFS client application which recently upgraded from 3.2.0 to 3.2.1. 
After this upgrade (but not before), we sometimes see these errors when this 
application is used with clusters still running Hadoop 2.x (more specifically 
CDH 5.12.1):

{code}
WARN  [2020-02-24T00:54:32.856Z] 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
remote block reader. (_sampled: true)
java.io.EOFException:
at java.io.DataInputStream.readByte(DataInputStream.java:272)
at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
at 
org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
at 
org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
at 
org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
at 
org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
at 
org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
at 
org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
at 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
{code}

We get this warning for all DataNodes with a copy of the block, so the read 
fails.

I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to cause 
this, but HDFS-13617 and HDFS-14611 seem related, so tagging [~vagarychen] in 
case you have any ideas.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15190) HttpFS : Add Support for Storage Policy Satisfier

2020-02-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043729#comment-17043729
 ] 

Íñigo Goiri commented on HDFS-15190:


Thanks [~hemanthboyina] for bringing this up.
What is the use case to run the storage policy satisfier through HttpFS?

> HttpFS : Add Support for Storage Policy Satisfier 
> --
>
> Key: HDFS-15190
> URL: https://issues.apache.org/jira/browse/HDFS-15190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15190.001.patch
>
>
> Add support for SPS in httpfs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15190) HttpFS : Add Support for Storage Policy Satisfier

2020-02-24 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15190:
-
Attachment: HDFS-15190.001.patch
Status: Patch Available  (was: Open)

> HttpFS : Add Support for Storage Policy Satisfier 
> --
>
> Key: HDFS-15190
> URL: https://issues.apache.org/jira/browse/HDFS-15190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15190.001.patch
>
>
> Add support for SPS in httpfs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15190) HttpFS : Add Support for Storage Policy Satisfier

2020-02-24 Thread hemanthboyina (Jira)
hemanthboyina created HDFS-15190:


 Summary: HttpFS : Add Support for Storage Policy Satisfier 
 Key: HDFS-15190
 URL: https://issues.apache.org/jira/browse/HDFS-15190
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: hemanthboyina
Assignee: hemanthboyina


Add support for SPS in httpfs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15187) CORRUPT replica mismatch between namenodes after failover

2020-02-24 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043614#comment-17043614
 ] 

Hudson commented on HDFS-15187:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17983 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17983/])
HDFS-15187. CORRUPT replica mismatch between namenodes after failover. 
(ayushsaxena: rev 7f8685f4760f1358bb30927a7da9a5041e8c39e1)
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestCorruptionWithFailover.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java


> CORRUPT replica mismatch between namenodes after failover
> -
>
> Key: HDFS-15187
> URL: https://issues.apache.org/jira/browse/HDFS-15187
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
> Attachments: HDFS-15187-01.patch, HDFS-15187-02.patch, 
> HDFS-15187-03.patch
>
>
> The corrupt replica identified by Active Namenode, isn't identified by the 
> Other Namenode, when it is failovered to Active, in case the replica is being 
> marked corrupt due to updatePipeline.
> Scenario to repro :
> 1. Create a file, while writing turn one datanode down, to trigger update 
> pipeline.
> 2. Write some more data.
> 3. Close the file.
> 4. Turn on the shutdown datanode.
> 5. The replica in the datanode will be identifed as CORRUPT and the corrupt 
> count will be 1.
> 6. Failover to other Namenode.
> 7. Wait for all pending IBR processing.
> 8. The corrupt count will not be same, and the FSCK won't show the corrupt 
> replica.
> 9. Failover back to first namenode.
> 10. Corrupt count and corrupt replica will be there.
> Both Namenodes shows different stuff.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15166) Remove redundant field fStream in ByteStringLog

2020-02-24 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043605#comment-17043605
 ] 

Hudson commented on HDFS-15166:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17982 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17982/])
HDFS-15166. Remove redundant field fStream in ByteStringLog. Contributed 
(ayushsaxena: rev 93b8f453b96470f1a6cc9ac098f4934ddd631657)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java


> Remove redundant field fStream in ByteStringLog
> ---
>
> Key: HDFS-15166
> URL: https://issues.apache.org/jira/browse/HDFS-15166
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie, newbie++
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15166.000.patch
>
>
> {{ByteStringLog.fStream}} is only used in {{init()}} method and can be 
> replaced by a local variable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15166) Remove redundant field fStream in ByteStringLog

2020-02-24 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043602#comment-17043602
 ] 

Xieming Li commented on HDFS-15166:
---

Thank you for the report and review!

> Remove redundant field fStream in ByteStringLog
> ---
>
> Key: HDFS-15166
> URL: https://issues.apache.org/jira/browse/HDFS-15166
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie, newbie++
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15166.000.patch
>
>
> {{ByteStringLog.fStream}} is only used in {{init()}} method and can be 
> replaced by a local variable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15111) stopStandbyServices() should log which service state it is transitioning from.

2020-02-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043596#comment-17043596
 ] 

Ayush Saxena commented on HDFS-15111:
-

[~shv] thoughts on v003 ?

> stopStandbyServices() should log which service state it is transitioning from.
> --
>
> Key: HDFS-15111
> URL: https://issues.apache.org/jira/browse/HDFS-15111
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, logging
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie++
> Attachments: HDFS-15111.001.patch, HDFS-15111.002.patch, 
> HDFS-15111.003.patch
>
>
> Trying to transition Observer to Standby state. {{stopStandbyServices()}} 
> logs that it is "Stopping services started for standby state". It should be 
> "Stopping services started for observer state"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15187) CORRUPT replica mismatch between namenodes after failover

2020-02-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043594#comment-17043594
 ] 

Ayush Saxena commented on HDFS-15187:
-

Committed to trunk.
Thanx [~elgoiri] and [~vinayakumarb] for the reviews!!!

> CORRUPT replica mismatch between namenodes after failover
> -
>
> Key: HDFS-15187
> URL: https://issues.apache.org/jira/browse/HDFS-15187
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
> Attachments: HDFS-15187-01.patch, HDFS-15187-02.patch, 
> HDFS-15187-03.patch
>
>
> The corrupt replica identified by Active Namenode, isn't identified by the 
> Other Namenode, when it is failovered to Active, in case the replica is being 
> marked corrupt due to updatePipeline.
> Scenario to repro :
> 1. Create a file, while writing turn one datanode down, to trigger update 
> pipeline.
> 2. Write some more data.
> 3. Close the file.
> 4. Turn on the shutdown datanode.
> 5. The replica in the datanode will be identifed as CORRUPT and the corrupt 
> count will be 1.
> 6. Failover to other Namenode.
> 7. Wait for all pending IBR processing.
> 8. The corrupt count will not be same, and the FSCK won't show the corrupt 
> replica.
> 9. Failover back to first namenode.
> 10. Corrupt count and corrupt replica will be there.
> Both Namenodes shows different stuff.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15187) CORRUPT replica mismatch between namenodes after failover

2020-02-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15187:

Hadoop Flags: Reviewed
  Resolution: Fixed
  Status: Resolved  (was: Patch Available)

> CORRUPT replica mismatch between namenodes after failover
> -
>
> Key: HDFS-15187
> URL: https://issues.apache.org/jira/browse/HDFS-15187
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
> Attachments: HDFS-15187-01.patch, HDFS-15187-02.patch, 
> HDFS-15187-03.patch
>
>
> The corrupt replica identified by Active Namenode, isn't identified by the 
> Other Namenode, when it is failovered to Active, in case the replica is being 
> marked corrupt due to updatePipeline.
> Scenario to repro :
> 1. Create a file, while writing turn one datanode down, to trigger update 
> pipeline.
> 2. Write some more data.
> 3. Close the file.
> 4. Turn on the shutdown datanode.
> 5. The replica in the datanode will be identifed as CORRUPT and the corrupt 
> count will be 1.
> 6. Failover to other Namenode.
> 7. Wait for all pending IBR processing.
> 8. The corrupt count will not be same, and the FSCK won't show the corrupt 
> replica.
> 9. Failover back to first namenode.
> 10. Corrupt count and corrupt replica will be there.
> Both Namenodes shows different stuff.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15166) Remove redundant field fStream in ByteStringLog

2020-02-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043586#comment-17043586
 ] 

Ayush Saxena commented on HDFS-15166:
-

Committed to trunk, 3.2, 3.1 and 2.10
Thanx [~risyomei] for the contribution and [~shv] for the report!!!

> Remove redundant field fStream in ByteStringLog
> ---
>
> Key: HDFS-15166
> URL: https://issues.apache.org/jira/browse/HDFS-15166
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie, newbie++
> Attachments: HDFS-15166.000.patch
>
>
> {{ByteStringLog.fStream}} is only used in {{init()}} method and can be 
> replaced by a local variable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15166) Remove redundant field fStream in ByteStringLog

2020-02-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15166:

Fix Version/s: 2.10.1
   3.2.2
   3.1.4
   3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Remove redundant field fStream in ByteStringLog
> ---
>
> Key: HDFS-15166
> URL: https://issues.apache.org/jira/browse/HDFS-15166
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie, newbie++
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15166.000.patch
>
>
> {{ByteStringLog.fStream}} is only used in {{init()}} method and can be 
> replaced by a local variable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-13393) Improve OOM logging

2020-02-24 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HDFS-13393:
-

Assignee: (was: Gabor Bota)

> Improve OOM logging
> ---
>
> Key: HDFS-13393
> URL: https://issues.apache.org/jira/browse/HDFS-13393
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover, datanode
>Reporter: Wei-Chiu Chuang
>Priority: Major
>
> It is not uncommon to find "java.lang.OutOfMemoryError: unable to create new 
> native thread" errors in a HDFS cluster. Most often this happens when 
> DataNode creating DataXceiver threads, or when balancer creates threads for 
> moving blocks around.
> In most of cases, the "OOM" is a symptom of number of threads reaching system 
> limit, rather than actually running out of memory, and the current logging of 
> this message is usually misleading (suggesting this is due to insufficient 
> memory)
> How about capturing the OOM, and if it is due to "unable to create new native 
> thread", print some more helpful message like "bump your ulimit" or "take a 
> jstack of the process"?
> Even better, surface this error to make it more visible. It usually takes a 
> while for an in-depth investigation after users notice some job fails, by the 
> time the evidences may already been gone (like jstack output).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15166) Remove redundant field fStream in ByteStringLog

2020-02-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043568#comment-17043568
 ] 

Ayush Saxena commented on HDFS-15166:
-

+1

> Remove redundant field fStream in ByteStringLog
> ---
>
> Key: HDFS-15166
> URL: https://issues.apache.org/jira/browse/HDFS-15166
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie, newbie++
> Attachments: HDFS-15166.000.patch
>
>
> {{ByteStringLog.fStream}} is only used in {{init()}} method and can be 
> replaced by a local variable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15166) Remove redundant field fStream in ByteStringLog

2020-02-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043459#comment-17043459
 ] 

Hadoop QA commented on HDFS-15166:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 13s{color} 
| {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:blue}0{color} | {color:blue} asflicense {color} | {color:blue}  0m 
34s{color} | {color:blue} ASF License check generated no output? {color} |
| {color:black}{color} | {color:black} {color} | {color:black}110m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStorageStateRecovery |
|   | hadoop.hdfs.TestFileCreationClient |
|   | hadoop.hdfs.TestDataTransferKeepalive |
|   | hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer |
|   | hadoop.hdfs.TestDecommissionWithStriped |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.TestDFSStripedInputStream |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.TestWriteBlockGetsBlockLengthHint |
|   | hadoop.hdfs.TestReadStripedFileWithDecodingDeletedData |
|   | hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork |
|   | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.server.blockmanagement.TestBlockInfoStriped |
|   | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15166 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12994305/HDFS-15166.000.patch |
| Optional Tests |  dupname  

[jira] [Comment Edited] (HDFS-15098) Add SM4 encryption method for HDFS

2020-02-24 Thread Andrea (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043209#comment-17043209
 ] 

Andrea edited comment on HDFS-15098 at 2/24/20 11:52 AM:
-

This patch can be used which  hadoop version  and openssl version ?


was (Author: andrea_julianos_one):
This patch can be used whice  hadoop version  and openssl version ?

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: liusheng
>Priority: Major
> Attachments: HDFS-15098.001.patch
>
>
> SM4 (formerly SMS4)is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
> SM4 was a cipher proposed to for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-02-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043443#comment-17043443
 ] 

Hadoop QA commented on HDFS-15025:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
52s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 16 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
59s{color} | {color:green} trunk passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 59s{color} | {color:orange} The patch fails to run checkstyle in root 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
43s{color} | {color:red} hadoop-common in trunk failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
41s{color} | {color:red} hadoop-hdfs-client in trunk failed. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
24m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 10m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
34s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
57s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 23m 
20s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 23m 20s{color} | 
{color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 23m 20s{color} 
| {color:red} root in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 43s{color} | {color:orange} The patch fails to run checkstyle in root 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
46s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
48s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
41s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
0m 42s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
38s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m 38s{color} 
| {color:red} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 35s{color} 
| {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 27s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:blue}0{color} | {color:blue} asflicense {color} | {color:blue}  0m 
40s{color} | {color:blue} ASF License check generated no output? {color} |
| {color:black}{color} | {color:black} {color} | {color:black}150m  5s{color} | 

[jira] [Commented] (HDFS-15111) stopStandbyServices() should log which service state it is transitioning from.

2020-02-24 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043418#comment-17043418
 ] 

Xieming Li commented on HDFS-15111:
---

Sorry for not noticing the update for a long time.

I am also ok with either way, but I prefer implementation in v003 mainly 
because " it shall maintain the present behavior", as Ayush stated. 

Using assertion impose a risk of stopping the program, though that possibility 
is nearly zero.

> stopStandbyServices() should log which service state it is transitioning from.
> --
>
> Key: HDFS-15111
> URL: https://issues.apache.org/jira/browse/HDFS-15111
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, logging
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie++
> Attachments: HDFS-15111.001.patch, HDFS-15111.002.patch, 
> HDFS-15111.003.patch
>
>
> Trying to transition Observer to Standby state. {{stopStandbyServices()}} 
> logs that it is "Stopping services started for standby state". It should be 
> "Stopping services started for observer state"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15166) Remove redundant field fStream in ByteStringLog

2020-02-24 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-15166:
--
Attachment: HDFS-15166.000.patch
Status: Patch Available  (was: Open)

> Remove redundant field fStream in ByteStringLog
> ---
>
> Key: HDFS-15166
> URL: https://issues.apache.org/jira/browse/HDFS-15166
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie, newbie++
> Attachments: HDFS-15166.000.patch
>
>
> {{ByteStringLog.fStream}} is only used in {{init()}} method and can be 
> replaced by a local variable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15189) update jackon-databind version

2020-02-24 Thread angerszhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angerszhu updated HDFS-15189:
-
Description: according to 
[CVE-2020-8840|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840], 
maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*  (was: 
according to 
[CVE-2020-8840]([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840]),
 maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*)

> update jackon-databind version
> --
>
> Key: HDFS-15189
> URL: https://issues.apache.org/jira/browse/HDFS-15189
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: angerszhu
>Priority: Major
>
> according to 
> [CVE-2020-8840|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840], 
> maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-02-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043375#comment-17043375
 ] 

Hadoop QA commented on HDFS-15025:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HDFS-15025 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15025 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12989371/NVDIMM_patch%28WIP%29.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28834/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: hadoop_hdfs_hw
>Priority: Major
> Attachments: Applying NVDIMM to HDFS.pdf, NVDIMM_patch(WIP).patch
>
>
> The non-volatile memory NVDIMM is faster than SSD, it can be used 
> simultaneously with RAM, DISK, SSD. The data of HDFS stored directly on 
> NVDIMM can not only improves the response rate of HDFS, but also ensure the 
> reliability of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15189) update jackon-databind version

2020-02-24 Thread angerszhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043376#comment-17043376
 ] 

angerszhu commented on HDFS-15189:
--

cc [~hexiaoqiao] 

can you help to ping some one who familiar to this?

> update jackon-databind version
> --
>
> Key: HDFS-15189
> URL: https://issues.apache.org/jira/browse/HDFS-15189
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: angerszhu
>Priority: Major
>
> according to 
> [CVE-2020-8840|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840], 
> maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15189) update jackon-databind version

2020-02-24 Thread angerszhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angerszhu updated HDFS-15189:
-
Description: according to 
[CVE-2020-8840]([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840]),
 maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*  (was: 
according to 
[*CVE-2020-8840]([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840]),
 maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*)

> update jackon-databind version
> --
>
> Key: HDFS-15189
> URL: https://issues.apache.org/jira/browse/HDFS-15189
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: angerszhu
>Priority: Major
>
> according to 
> [CVE-2020-8840]([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840]),
>  maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15189) update jackon-databind version

2020-02-24 Thread angerszhu (Jira)
angerszhu created HDFS-15189:


 Summary: update jackon-databind version
 Key: HDFS-15189
 URL: https://issues.apache.org/jira/browse/HDFS-15189
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: angerszhu


according to *CVE-2020-8840, maybe we should update  jackson-databind to 
2.9.10.3 or 2.10.x?*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15189) update jackon-databind version

2020-02-24 Thread angerszhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angerszhu updated HDFS-15189:
-
Description: according to 
[*CVE-2020-8840]([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840]),
 maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*  (was: 
according to *CVE-2020-8840, maybe we should update  jackson-databind to 
2.9.10.3 or 2.10.x?*)

> update jackon-databind version
> --
>
> Key: HDFS-15189
> URL: https://issues.apache.org/jira/browse/HDFS-15189
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: angerszhu
>Priority: Major
>
> according to 
> [*CVE-2020-8840]([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8840]),
>  maybe we should update  jackson-databind to 2.9.10.3 or 2.10.x?*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-02-24 Thread hadoop_hdfs_hw (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hadoop_hdfs_hw updated HDFS-15025:
--
Attachment: (was: HDFS-15025.001.patch)

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: hadoop_hdfs_hw
>Priority: Major
> Attachments: Applying NVDIMM to HDFS.pdf, NVDIMM_patch(WIP).patch
>
>
> The non-volatile memory NVDIMM is faster than SSD, it can be used 
> simultaneously with RAM, DISK, SSD. The data of HDFS stored directly on 
> NVDIMM can not only improves the response rate of HDFS, but also ensure the 
> reliability of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2020-02-24 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043367#comment-17043367
 ] 

Stephen O'Donnell commented on HDFS-14854:
--

Hi [~ayushtkn] there are quite a few factors to consider in this.

Firstly, decommission speed is mostly dictated by the speed the replication 
manager works at, and nothing about that has changed with this patch.

One thing we did attempt to do, was ensure that the blocks were shuffled so 
that if the decommissioning node has many disks, it should not pick blocks only 
from the same disk, which is what the origional monitor did. Depending on your 
settings for max-streams and work-multiplier, there may not be enough blocks 
moving at the same time to saturate a disk and therefore you would not see any 
benefit of this.

This change should also result in less load on the NN. It scans the blocks less 
often to check if they have completed replication, does a lot of work under the 
read lock rather than write lock. For maintenance, it should also result in 
nodes going into maintenance faster provided they don't need to do any 
replication.

Even if replication happens at the same speed, the new monitor should use less 
CPU and cause less lock contention on the NN, which is a good thing, but very 
hard to measure.

> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 012_to_013_changes.diff, 
> Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, HDFS-14854.002.patch, 
> HDFS-14854.003.patch, HDFS-14854.004.patch, HDFS-14854.005.patch, 
> HDFS-14854.006.patch, HDFS-14854.007.patch, HDFS-14854.008.patch, 
> HDFS-14854.009.patch, HDFS-14854.010.patch, HDFS-14854.011.patch, 
> HDFS-14854.012.patch, HDFS-14854.013.patch, HDFS-14854.014.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue and under replicated 
> blocks from a future node or disk failure may way for a long time before they 
> are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicate, wasting resources
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Yao Guangdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043360#comment-17043360
 ] 

Yao Guangdong commented on HDFS-15186:
--

[~ayushtkn], Thank you very much. I will fix it as soon as possible. :D 

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15188) Add option to set Write/Read timeout extension for different StorageType

2020-02-24 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15188:

Priority: Minor  (was: Major)

> Add option to set Write/Read timeout extension for different StorageType
> 
>
> Key: HDFS-15188
> URL: https://issues.apache.org/jira/browse/HDFS-15188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, dfsclient
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15188.patch
>
>
> Different storage types have different speeds. Especially for low-speed 
> Archive volume, errors are often reported under current timeout. Add an 
> unified solution to set options for different StorageType.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043353#comment-17043353
 ] 

Ayush Saxena commented on HDFS-15186:
-

[~yaoguangdong] he hasn't fixed this problem, he fixed a similar problem, where 
the live node was busy,
In your problem decommissioning node is busy. You can similarly fix this 
problem as HDFS-14768 did for live busy nodes here.

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Yao Guangdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043286#comment-17043286
 ] 

Yao Guangdong edited comment on HDFS-15186 at 2/24/20 10:06 AM:


[~ayushtkn], OK. Thanks  for your advice. [~gjhkael] had been fixed it in 
namenode side by HDFS-14768 . I think this is duplicated. You can close it.


was (Author: yaoguangdong):
[~ayushtkn], OK. Thanks  for your reply. [~gjhkael] had been fixed it in 
namenode side by HDFS-14768 . I think this is duplicated. You can close it.

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Yao Guangdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043286#comment-17043286
 ] 

Yao Guangdong commented on HDFS-15186:
--

[~ayushtkn], OK. Thanks  for your reply. [~gjhkael] had been fixed it in 
namenode side by HDFS-14768 . I think this is duplicated. You can close it.

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15120) Refresh BlockPlacementPolicy at runtime.

2020-02-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043268#comment-17043268
 ] 

Ayush Saxena commented on HDFS-15120:
-

Thanx [~LiJinglun] . Had a quick look. Couldn't check in full.
But can we avoid taking lock and stuff, this would be part of normal write 
flow, I am not sure, how much impact, but there would be some in taking read 
lock, though trivial, and BPP changing in runtime would be a very rare 
scenario. I don't think we should put its cost to any other basic process. Can 
we get rid of it. I think we can handle some latency in updating of BPP, when 
we reconfigure it at runtime.
Secondly, Can we directly just change the {{placementPolicies}} variable with a 
new one in {{BlockManager}} rather than changing it internally?


> Refresh BlockPlacementPolicy at runtime.
> 
>
> Key: HDFS-15120
> URL: https://issues.apache.org/jira/browse/HDFS-15120
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15120.001.patch, HDFS-15120.002.patch, 
> HDFS-15120.003.patch, HDFS-15120.004.patch, HDFS-15120.005.patch, 
> HDFS-15120.006.patch
>
>
> Now if we want to switch BlockPlacementPolicies we need to restart the 
> NameNode. It would be convenient if we can switch it at runtime. For example 
> we can switch between AvailableSpaceBlockPlacementPolicy and 
> BlockPlacementPolicyDefault as needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043266#comment-17043266
 ] 

Ayush Saxena commented on HDFS-15186:
-

Decommissioning taking time is another aspect, we can't mix it here. we can't 
directly compare EC with 3X rep, both have their pros and cons. 
May be we can track that and discuss the decommissioning taking time issue 
separately.  


> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch
>
>
> I can find some parity block's content with all 0 when i decommission some 
> DataNode(more than 1) from a cluster. And the probability is very big(parts 
> per thousand).This is a big problem.You can think that if we read data from 
> the zero parity block or use the zero parity block to recover a block which 
> can make us use the error data even we don't know it.
> There is some case in the below:
> B: Busy DataNode, 
> D:Decommissioning DataNode,
> Others is normal.
> 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 
> 7, 8(D)], the DN may received reconstruct block command and the 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which  in 
> the class StripedReconstructionInfo) length is 2. 
> The targets's length is 2 which mean that the DataNode need recover 2 
> internal block in current code.But from the liveIndices we only can find 1 
> missing block, so the method StripedWriter#initTargetIndices will use 0 as 
> the default recover block and don't care the indices 0 is in the sources 
> indices or not.
> When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] 
> use the ec algorithm.We can find that the indices [0] is in the both the 
> sources indices and the targets indices in this case. The returned target 
> buffer in the indices [6] is always 0 from the ec  algorithm.So I think this 
> is the ec algorithm's problem. Because it should more fault tolerance.I try 
> to fixed it .But it is too hard. Because the case is too more. The second is 
> another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to 
> recover indices [0, 6, 0]). So I changed my mind.Invoke the ec  algorithm 
> with a correct parameters. Which mean that remove the duplicate target 
> indices 0 in this case.Finally, I fixed it in this way.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-02-24 Thread hadoop_hdfs_hw (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hadoop_hdfs_hw updated HDFS-15025:
--
  Attachment: HDFS-15025.001.patch
Release Note: Adding the new storage media NVDIMM and ALL_NVDIMM storage 
policy  on HDFS,including the test code for them
Tags: HDFS, NVDIMM
  Status: Patch Available  (was: Open)

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: hadoop_hdfs_hw
>Priority: Major
> Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, 
> NVDIMM_patch(WIP).patch
>
>
> The non-volatile memory NVDIMM is faster than SSD, it can be used 
> simultaneously with RAM, DISK, SSD. The data of HDFS stored directly on 
> NVDIMM can not only improves the response rate of HDFS, but also ensure the 
> reliability of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2020-02-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043257#comment-17043257
 ] 

Ayush Saxena commented on HDFS-14854:
-

Tried decommissioning using this with 30 Lack blocks, 3 datanodes. Sadly, this 
didn't save any time for me. To my surprise time taken with this and without 
this was exactly same. 

> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 012_to_013_changes.diff, 
> Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, HDFS-14854.002.patch, 
> HDFS-14854.003.patch, HDFS-14854.004.patch, HDFS-14854.005.patch, 
> HDFS-14854.006.patch, HDFS-14854.007.patch, HDFS-14854.008.patch, 
> HDFS-14854.009.patch, HDFS-14854.010.patch, HDFS-14854.011.patch, 
> HDFS-14854.012.patch, HDFS-14854.013.patch, HDFS-14854.014.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue and under replicated 
> blocks from a future node or disk failure may way for a long time before they 
> are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicate, wasting resources
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-02-24 Thread Andrea (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043209#comment-17043209
 ] 

Andrea commented on HDFS-15098:
---

This patch can be used whice  hadoop version  and openssl version ?

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: liusheng
>Priority: Major
> Attachments: HDFS-15098.001.patch
>
>
> SM4 (formerly SMS4)is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
> SM4 was a cipher proposed to for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org