[jira] [Commented] (HDFS-15340) RBF: Balance data across federation namespaces with DistCp and snapshot diff / Step 1: The State Machine(BalanceProcedureScheduler)

2020-05-10 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104072#comment-17104072
 ] 

Jinglun commented on HDFS-15340:


Uploaded v04, which fixes the checkstyle issue.

> RBF: Balance data across federation namespaces with DistCp and snapshot diff 
> / Step 1: The State Machine(BalanceProcedureScheduler)
> ---
>
> Key: HDFS-15340
> URL: https://issues.apache.org/jira/browse/HDFS-15340
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15340.001.patch, HDFS-15340.002.patch, 
> HDFS-15340.003.patch, HDFS-15340.004.patch
>
>
> The patch in HDFS-15294 is too big to review, so we split it into 2 patches. This 
> is the first one. Details can be found at HDFS-15294.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15340) RBF: Balance data across federation namespaces with DistCp and snapshot diff / Step 1: The State Machine(BalanceProcedureScheduler)

2020-05-10 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-15340:
---
Attachment: HDFS-15340.004.patch

> RBF: Balance data across federation namespaces with DistCp and snapshot diff 
> / Step 1: The State Machine(BalanceProcedureScheduler)
> ---
>
> Key: HDFS-15340
> URL: https://issues.apache.org/jira/browse/HDFS-15340
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15340.001.patch, HDFS-15340.002.patch, 
> HDFS-15340.003.patch, HDFS-15340.004.patch
>
>
> The patch in HDFS-15294 is too big to review, so we split it into 2 patches. This 
> is the first one. Details can be found at HDFS-15294.






[jira] [Commented] (HDFS-15340) RBF: Balance data across federation namespaces with DistCp and snapshot diff / Step 1: The State Machine(BalanceProcedureScheduler)

2020-05-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104065#comment-17104065
 ] 

Hadoop QA commented on HDFS-15340:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
55s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
11s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 17s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 42s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
10s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-HDFS-Build/29263/artifact/out/Dockerfile
 |
| JIRA Issue | HDFS-15340 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002576/HDFS-15340.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 09cbeea788d0 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / aab9e0b16ec |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29263/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 

[jira] [Commented] (HDFS-15349) Adapt the netty4 ARM support switch(YARN-9898)

2020-05-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104055#comment-17104055
 ] 

Hadoop QA commented on HDFS-15349:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 22m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
40m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
12s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
11s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
12s{color} | {color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 12s{color} 
| {color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
11s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  0m 
12s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
11s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
12s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 12s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 11s{color} 
| {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:blue}0{color} | {color:blue} asflicense {color} | {color:blue}  0m 
14s{color} | {color:blue} ASF License check generated no output? {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-HDFS-Build/29262/artifact/out/Dockerfile
 |
| JIRA Issue | HDFS-15349 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002574/HDFS-15349.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient xml |
| uname | Linux 2e6777fd7e26 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | 

[jira] [Updated] (HDFS-15293) Relax the condition for accepting a fsimage when receiving a checkpoint

2020-05-10 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15293:
-
Issue Type: Bug  (was: Improvement)

> Relax the condition for accepting a fsimage when receiving a checkpoint 
> 
>
> Key: HDFS-15293
> URL: https://issues.apache.org/jira/browse/HDFS-15293
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
>  Labels: multi-sbnn, release-blocker
> Attachments: HDFS-15293.001.patch
>
>
> HDFS-12979 introduced logic so that, if the ANN sees consecutive fsImage uploads 
> from a Standby with only a small delta compared to the previous fsImage, the ANN 
> rejects the image. This is to avoid overly frequent fsImage uploads when 
> there are multiple Standby nodes. However, this check could be too stringent.






[jira] [Updated] (HDFS-15287) HDFS rollingupgrade prepare never finishes

2020-05-10 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15287:
-
Target Version/s: 3.4.0  (was: 3.3.0, 2.10.1)
Priority: Major  (was: Blocker)

Hi [~kihwal], would you check whether the patch in HDFS-15293 fixes the 
side-effect or not?

I lowered the priority and added the release-blocker label to the follow-up jira 
(HDFS-15293).

> HDFS rollingupgrade prepare never finishes
> --
>
> Key: HDFS-15287
> URL: https://issues.apache.org/jira/browse/HDFS-15287
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0, 3.3.0
>Reporter: Kihwal Lee
>Priority: Major
>
> After HDFS-12979, the prepare step of rolling upgrade does not work. This is 
> because it added an additional check that sufficient time has passed since the last 
> checkpoint. Since RU rollback image creation and upload can happen at any time, 
> the upload of the rollback image never succeeds. For a new cluster deployed for 
> testing, it might work since it has never checkpointed before.
> It was found that this check is disabled for unit tests, defeating the very 
> purpose of testing.






[jira] [Updated] (HDFS-15293) Relax the condition for accepting a fsimage when receiving a checkpoint

2020-05-10 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15293:
-
Target Version/s: 3.3.0, 3.1.4, 3.2.2, 2.10.1  (was: 3.3.0)

> Relax the condition for accepting a fsimage when receiving a checkpoint 
> 
>
> Key: HDFS-15293
> URL: https://issues.apache.org/jira/browse/HDFS-15293
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
>  Labels: multi-sbnn, release-blocker
> Attachments: HDFS-15293.001.patch
>
>
> HDFS-12979 introduced logic so that, if the ANN sees consecutive fsImage uploads 
> from a Standby with only a small delta compared to the previous fsImage, the ANN 
> rejects the image. This is to avoid overly frequent fsImage uploads when 
> there are multiple Standby nodes. However, this check could be too stringent.






[jira] [Updated] (HDFS-15293) Relax the condition for accepting a fsimage when receiving a checkpoint

2020-05-10 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15293:
-
Target Version/s: 3.3.0
  Labels: multi-sbnn release-blocker  (was: multi-sbnn)

> Relax the condition for accepting a fsimage when receiving a checkpoint 
> 
>
> Key: HDFS-15293
> URL: https://issues.apache.org/jira/browse/HDFS-15293
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
>  Labels: multi-sbnn, release-blocker
> Attachments: HDFS-15293.001.patch
>
>
> HDFS-12979 introduced logic so that, if the ANN sees consecutive fsImage uploads 
> from a Standby with only a small delta compared to the previous fsImage, the ANN 
> rejects the image. This is to avoid overly frequent fsImage uploads when 
> there are multiple Standby nodes. However, this check could be too stringent.






[jira] [Commented] (HDFS-15340) RBF: Balance data across federation namespaces with DistCp and snapshot diff / Step 1: The State Machine(BalanceProcedureScheduler)

2020-05-10 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104039#comment-17104039
 ] 

Jinglun commented on HDFS-15340:


Hi [~ayushtkn] [~linyiqun], thanks for your nice review, comments and suggestions! 
Uploaded v03.

 
{quote}BalanceJob.java and BalanceProcedureConfigKeys.java both have private 
empty constructors, give a check if you would need them?
{quote}
The empty constructor of BalanceJob is useless, so I'll remove it. The private 
empty constructor in BalanceProcedureConfigKeys.java is there because of checkstyle, 
which has a rule along the lines of: a utility class must not have a public 
constructor. I'll keep it to pass checkstyle.

 
{quote}Don't need I think and I don't find the place that will invoke following 
construct method.
{quote}
The empty constructors of BalanceProcedure, MultiPhaseProcedure, 
RecordProcedure and RetryProcedure are used for deserialization, which uses 
reflection and needs a standard empty constructor. The deserialization code is 
in BalanceJob.readFields().
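
To illustrate the two points above, here is a minimal sketch, not code from the patch. DemoProcedure, DemoConfigKeys and ReflectionDemo are hypothetical stand-ins, and restore() only imitates what a readFields-style method might do. It shows why reflection needs a no-argument constructor, and how a private constructor keeps a constants-only class from being instantiated.

```java
import java.lang.reflect.Constructor;

/** Hypothetical stand-in for a procedure that is restored from a journal. */
class DemoProcedure {
  private String name = "unnamed";

  /** No-argument constructor: reflection uses it to create an empty instance. */
  DemoProcedure() {
  }

  DemoProcedure(String name) {
    this.name = name;
  }

  /** Imitates a readFields-style method that fills in the saved state. */
  void restore(String savedName) {
    this.name = savedName;
  }

  String getName() {
    return name;
  }
}

/** Hypothetical constants holder; the private constructor blocks instantiation. */
final class DemoConfigKeys {
  static final String PREFIX = "demo.balance.";

  private DemoConfigKeys() {
  }
}

final class ReflectionDemo {
  private ReflectionDemo() {
  }

  /** Creates an instance by class name via its no-argument constructor, then restores state. */
  static DemoProcedure newProcedure(String className, String savedName) {
    try {
      Class<?> clazz = Class.forName(className);
      // Fails with NoSuchMethodException if the class has no no-argument constructor.
      Constructor<?> ctor = clazz.getDeclaredConstructor();
      ctor.setAccessible(true);
      DemoProcedure procedure = (DemoProcedure) ctor.newInstance();
      procedure.restore(savedName);
      return procedure;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + className, e);
    }
  }
}
```

If the no-argument constructor were removed from DemoProcedure, getDeclaredConstructor() would throw NoSuchMethodException, which is why such constructors can look unused yet still be required.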

 

A small reminder: HDFSJournal is renamed to BalanceJournalInfoHDFS in v03.

> RBF: Balance data across federation namespaces with DistCp and snapshot diff 
> / Step 1: The State Machine(BalanceProcedureScheduler)
> ---
>
> Key: HDFS-15340
> URL: https://issues.apache.org/jira/browse/HDFS-15340
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15340.001.patch, HDFS-15340.002.patch, 
> HDFS-15340.003.patch
>
>
> The patch in HDFS-15294 is too big to review, so we split it into 2 patches. This 
> is the first one. Details can be found at HDFS-15294.






[jira] [Updated] (HDFS-15340) RBF: Balance data across federation namespaces with DistCp and snapshot diff / Step 1: The State Machine(BalanceProcedureScheduler)

2020-05-10 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-15340:
---
Attachment: HDFS-15340.003.patch

> RBF: Balance data across federation namespaces with DistCp and snapshot diff 
> / Step 1: The State Machine(BalanceProcedureScheduler)
> ---
>
> Key: HDFS-15340
> URL: https://issues.apache.org/jira/browse/HDFS-15340
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15340.001.patch, HDFS-15340.002.patch, 
> HDFS-15340.003.patch
>
>
> The patch in HDFS-15294 is too big to review, so we split it into 2 patches. This 
> is the first one. Details can be found at HDFS-15294.






[jira] [Updated] (HDFS-15349) Adapt the netty4 ARM support switch(YARN-9898)

2020-05-10 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated HDFS-15349:

Attachment: HDFS-15349.001.patch
Status: Patch Available  (was: Open)

> Adapt the netty4 ARM support switch(YARN-9898)
> --
>
> Key: HDFS-15349
> URL: https://issues.apache.org/jira/browse/HDFS-15349
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liusheng
>Priority: Major
> Attachments: HDFS-15349.001.patch
>
>
> As the patch for YARN-9898 keeps making Jenkins unhappy, I have split the HDFS 
> part of that patch into this issue, following [~ayushsaxena]'s suggestion.






[jira] [Created] (HDFS-15349) Adapt the netty4 ARM support switch(YARN-9898)

2020-05-10 Thread liusheng (Jira)
liusheng created HDFS-15349:
---

 Summary: Adapt the netty4 ARM support switch(YARN-9898)
 Key: HDFS-15349
 URL: https://issues.apache.org/jira/browse/HDFS-15349
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: liusheng


As the patch for YARN-9898 keeps making Jenkins unhappy, I have split the HDFS 
part of that patch into this issue, following [~ayushsaxena]'s suggestion.






[jira] [Commented] (HDFS-15243) Child directory should not be deleted or renamed if parent directory is a protected directory

2020-05-10 Thread liuyanyu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104004#comment-17104004
 ] 

liuyanyu commented on HDFS-15243:
-

Thanks [~ayushtkn] for reviewing. I have updated the patch according to your 
suggestions.

> Child directory should not be deleted or renamed if parent directory is a 
> protected directory
> -
>
> Key: HDFS-15243
> URL: https://issues.apache.org/jira/browse/HDFS-15243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.1.1
>Reporter: liuyanyu
>Assignee: liuyanyu
>Priority: Major
> Attachments: HDFS-15243.001.patch, HDFS-15243.002.patch, 
> HDFS-15243.003.patch, HDFS-15243.004.patch, HDFS-15243.005.patch, 
> HDFS-15243.006.patch, image-2020-03-28-09-23-31-335.png
>
>
> HDFS-8983 added fs.protected.directories to support protected directories on 
> the NameNode. But in my tests, when a parent directory (e.g. /testA) is set as a 
> protected directory, the child directory (e.g. /testA/testB) can still be 
> deleted or renamed. We protect a directory mainly to protect the 
> data under it, so I think the child directory should not be 
> deleted or renamed if the parent directory is a protected directory.
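
The behaviour requested above can be sketched as an ancestor walk. This is only an illustration under assumptions, not the code in the attached patches: ProtectedDirCheck and isUnderProtectedDir are hypothetical names, paths are assumed to be non-null, absolute and normalized, and the set stands in for the configured fs.protected.directories values.

```java
import java.util.Set;

final class ProtectedDirCheck {
  private ProtectedDirCheck() {
  }

  /**
   * Returns true if the path itself or any of its ancestors is in protectedDirs.
   * Assumes a non-null, normalized path such as "/testA/testB".
   */
  static boolean isUnderProtectedDir(String path, Set<String> protectedDirs) {
    String current = path;
    while (true) {
      if (protectedDirs.contains(current)) {
        return true;
      }
      if (current.equals("/")) {
        return false; // reached the root without finding a protected ancestor
      }
      int slash = current.lastIndexOf('/');
      if (slash < 0) {
        return false; // relative name with no parent component
      }
      // Move up one level; the parent of "/testA" is "/".
      current = (slash == 0) ? "/" : current.substring(0, slash);
    }
  }
}
```

Comparing whole path components, rather than using a plain string prefix test, keeps /testAB from being treated as a child of /testA.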






[jira] [Updated] (HDFS-15243) Child directory should not be deleted or renamed if parent directory is a protected directory

2020-05-10 Thread liuyanyu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuyanyu updated HDFS-15243:

Attachment: HDFS-15243.006.patch

> Child directory should not be deleted or renamed if parent directory is a 
> protected directory
> -
>
> Key: HDFS-15243
> URL: https://issues.apache.org/jira/browse/HDFS-15243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.1.1
>Reporter: liuyanyu
>Assignee: liuyanyu
>Priority: Major
> Attachments: HDFS-15243.001.patch, HDFS-15243.002.patch, 
> HDFS-15243.003.patch, HDFS-15243.004.patch, HDFS-15243.005.patch, 
> HDFS-15243.006.patch, image-2020-03-28-09-23-31-335.png
>
>
> HDFS-8983 added fs.protected.directories to support protected directories on 
> the NameNode. But in my tests, when a parent directory (e.g. /testA) is set as a 
> protected directory, the child directory (e.g. /testA/testB) can still be 
> deleted or renamed. We protect a directory mainly to protect the 
> data under it, so I think the child directory should not be 
> deleted or renamed if the parent directory is a protected directory.






[jira] [Updated] (HDFS-15348) [SBN Read] IllegalStateException happened when doing failover

2020-05-10 Thread xuzq (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuzq updated HDFS-15348:

Description: 
Standby shuts down when doing failover, and throws IllegalStateException.

_getJournaledEdits_ only returns _dfs.ha.tail-edits.qjm.rpc.max-txns_ edits, 
resulting in a failure to replay all edits in _catchupDuringFailover_.

 

The _streams.isEmpty()_ check then fails and this exception is thrown in 
_FSEditLog#openForWrite_.

The exception looks like:

 
{code:java}
2020-05-10 09:20:02,235 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode IPC Server handler 763 on 8022: Error encountered requiring NN shutdown. Shutting down immediately.
java.lang.IllegalStateException: Cannot start writing at txid 173922195318 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@47b73995
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:320)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1352)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1890)
    at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
    at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1763)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1605){code}
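
The failure mode described in this report can be pictured with a small model. This is a sketch under assumptions only: DemoJournal and CatchUpDemo are hypothetical classes, not the NameNode code, and the loop is not necessarily how the issue will be fixed. It shows that a single fetch capped at a maximum number of transactions can leave edits unread, while repeating the fetch until an empty batch comes back drains them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical journal that serves at most maxTxnsPerCall edits per request. */
final class DemoJournal {
  private final long lastTxId;
  private final int maxTxnsPerCall;

  DemoJournal(long lastTxId, int maxTxnsPerCall) {
    this.lastTxId = lastTxId;
    this.maxTxnsPerCall = maxTxnsPerCall;
  }

  /** Returns txids starting at fromTxId, capped at maxTxnsPerCall entries. */
  List<Long> fetchEdits(long fromTxId) {
    List<Long> batch = new ArrayList<>();
    for (long tx = fromTxId; tx <= lastTxId && batch.size() < maxTxnsPerCall; tx++) {
      batch.add(tx);
    }
    return batch;
  }
}

final class CatchUpDemo {
  private CatchUpDemo() {
  }

  /** One capped fetch: may stop short of the journal's last txid. */
  static long catchUpOnce(DemoJournal journal, long lastApplied) {
    return lastApplied + journal.fetchEdits(lastApplied + 1).size();
  }

  /** Repeats the capped fetch until nothing is left to read. */
  static long catchUpFully(DemoJournal journal, long lastApplied) {
    long applied = lastApplied;
    while (true) {
      List<Long> batch = journal.fetchEdits(applied + 1);
      if (batch.isEmpty()) {
        return applied;
      }
      applied += batch.size();
    }
  }
}
```

In this model, with 12 pending edits and a cap of 5, a single call stops at txid 5 and a readable stream remains, which corresponds to the condition that the exception above complains about.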
 

  was:
Standby shutdown when doing failover, and throw IllegalStateException.

`getJournaledEdits` only return `dfs.ha.tail-edits.qjm.rpc.max-txns` edits, 
resulting in failure to replay all edits in `catchupDuringFailover()`.

 

And check `streams.isEmpty()` will be throw this exception in 
`FSEditLog#openForWrite`

The exception like:

 
{code:java}
2020-05-10 09:20:02,235 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode IPC Server handler 763 on 8022: Error encountered requiring NN shutdown. Shutting down immediately.
java.lang.IllegalStateException: Cannot start writing at txid 173922195318 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@47b73995
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:320)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1352)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1890)
    at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
    at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1763)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1605){code}
 


> [SBN Read] IllegalStateException happened when doing failover
> -
>
> Key: HDFS-15348
> URL: https://issues.apache.org/jira/browse/HDFS-15348
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: xuzq
>Priority: Major
>
> Standby shuts down when doing failover, and throws IllegalStateException.
> _getJournaledEdits_ only returns _dfs.ha.tail-edits.qjm.rpc.max-txns_ edits, 
> resulting in a failure to replay all edits in _catchupDuringFailover_.
>  
> The _streams.isEmpty()_ check then fails and this exception is thrown in 
> _FSEditLog#openForWrite_.
> The exception looks like:
>  
> {code:java}
> 2020-05-10 09:20:02,235 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode IPC Server handler 763 on 8022: Error encountered requiring NN shutdown. Shutting down immediately.
> java.lang.IllegalStateException: Cannot start writing at txid 173922195318 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@47b73995
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:320)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1352)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1890)
>     at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>     at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
> at 
> 

[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-05-10 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103856#comment-17103856
 ] 

Lisheng Sun commented on HDFS-15255:


 
{quote}
2. There was a conflict due to HDFS-14283 in TestDFSInputStream. You 
contributed that change - could you check my change there looks fine and if so 
I think this one is good to commit now.
{quote}

Hi [~sodonnell], I have checked it and it looks good. I think we could commit 
the v10 patch to trunk. Thank you.

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15255-findbugs-test.001.patch, 
> HDFS-15255.001.patch, HDFS-15255.002.patch, HDFS-15255.003.patch, 
> HDFS-15255.004.patch, HDFS-15255.005.patch, HDFS-15255.006.patch, 
> HDFS-15255.007.patch, HDFS-15255.008.patch, HDFS-15255.009.patch, 
> HDFS-15255.010.patch, experiment-find-bugs.001.patch
>
>
> Consider the case where only one replica of a block is on SSD and the others are on HDD. 
> When the client reads the data, the current logic considers only the 
> distance between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer the node with the faster StorageType when the 
> distance is the same.
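
The ordering proposed in the description can be sketched as a two-level comparator. This is only an illustration under stated assumptions: `StorageAwareSort`, `Replica`, and `Media` are hypothetical names, not the classes used by `DatanodeManager`, and the enum is assumed to be declared fastest-first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical types for illustration only; these are not HDFS classes.
final class StorageAwareSort {
  // Assumption of this sketch: constants are declared fastest-first,
  // so ordinal() can be used as a speed rank.
  enum Media { SSD, DISK }

  static final class Replica {
    final String node;
    final int distance;  // network distance from the client
    final Media media;

    Replica(String node, int distance, Media media) {
      this.node = node;
      this.distance = distance;
      this.media = media;
    }
  }

  // Distance decides first; storage speed only breaks ties between
  // replicas that are equally distant from the client.
  static final Comparator<Replica> ORDER =
      Comparator.comparingInt((Replica r) -> r.distance)
          .thenComparingInt(r -> r.media.ordinal());

  // Returns the node names in the order a client would try them.
  static List<String> sortedNodes(List<Replica> replicas) {
    List<Replica> copy = new ArrayList<>(replicas);
    copy.sort(ORDER);
    List<String> names = new ArrayList<>();
    for (Replica r : copy) {
      names.add(r.node);
    }
    return names;
  }
}
```

With this ordering a nearer HDD replica still wins over a more distant SSD replica; the storage type matters only when the distances are equal, which matches the wording of the description.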



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15344) DataNode#checkSuperuserPrivilege should use UGI#getGroups after HADOOP-13442

2020-05-10 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103844#comment-17103844
 ] 

Ayush Saxena commented on HDFS-15344:
-

There was no patch here, so I changed the state back to Open. It was creating 
confusion...

> DataNode#checkSuperuserPrivilege should use UGI#getGroups after HADOOP-13442
> 
>
> Key: HDFS-15344
> URL: https://issues.apache.org/jira/browse/HDFS-15344
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.5
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
>
> HADOOP-13442 added UGI#getGroups to avoid list->array->list conversions. This 
> ticket is opened to change DataNode#checkSuperuserPrivilege to use 
> UGI#getGroups. 






[jira] [Updated] (HDFS-15344) DataNode#checkSuperuserPrivilege should use UGI#getGroups after HADOOP-13442

2020-05-10 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15344:

Status: Open  (was: Patch Available)

> DataNode#checkSuperuserPrivilege should use UGI#getGroups after HADOOP-13442
> 
>
> Key: HDFS-15344
> URL: https://issues.apache.org/jira/browse/HDFS-15344
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.5
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
>
> HADOOP-13442 added UGI#getGroups to avoid list->array->list conversions. This 
> ticket is opened to change DataNode#checkSuperuserPrivilege to use 
> UGI#getGroups. 






[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation

2020-05-10 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103843#comment-17103843
 ] 

Lisheng Sun commented on HDFS-12288:


Hi [~elgoiri],

I understand that this patch resolves the issue that the DataNode's xceiver count is only a 
rough calculation.  [~shahrs87] intended to solve this problem.

> Fix DataNode's xceiver count calculation
> 
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, 
> HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch, 
> HDFS-12288.006.patch, HDFS-12288.007.patch, HDFS-12288.008.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that the method is 
> only a very rough estimate, and in reality returns the total number of 
> threads in the thread group as opposed to the threads actually running.
> In some DNs, we saw this return ~50 for a long time, even though the 
> actual number of DataXceiver threads was next to none.
> This is a big issue as we use the xceiverCount to make decisions on the NN 
> for choosing replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, 
> which only accounts for the actual number of DataXceiver threads currently 
> running and thus represents the load on the DN much better.
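
The difference between the two counting approaches can be sketched as follows. This is a minimal illustration, not the DataNode's code: `ActiveWorkerCounter` is a hypothetical class that keeps an explicit counter around each task, so the value covers only work that is running right now, whereas `ThreadGroup.activeCount()` is documented as an estimate.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an xceiver-style worker counter; names are illustrative.
final class ActiveWorkerCounter {
  private final AtomicInteger active = new AtomicInteger();

  // Wraps a task so the counter covers exactly the time the task is running.
  Runnable counted(Runnable task) {
    return () -> {
      active.incrementAndGet();
      try {
        task.run();
      } finally {
        // Always released, even if the task throws.
        active.decrementAndGet();
      }
    };
  }

  // Number of wrapped tasks that are running at this moment.
  int activeCount() {
    return active.get();
  }
}
```

A scheduler that picks a replication source would read `activeCount()` from such a counter instead of asking the thread group, so idle or finished threads no longer inflate the reported load.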






[jira] [Commented] (HDFS-15340) RBF: Balance data across federation namespaces with DistCp and snapshot diff / Step 1: The State Machine(BalanceProcedureScheduler)

2020-05-10 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103828#comment-17103828
 ] 

Yiqun Lin commented on HDFS-15340:
--

I haven't done a deep review yet, but here are some initial comments on the 
readability of this patch:

*BalanceJob.java*
{noformat}
+ public static final Logger LOG =
+ LoggerFactory.getLogger(BalanceJob.class.getName());
{noformat}
{{BalanceJob.class.getName()}} can be simplified to {{BalanceJob.class}}
{noformat}
private BalanceJob() {}
{noformat}
I don't think this is necessary since we already define {{private 
BalanceJob(Iterable procedures, boolean remove)}}.

In BalanceJob#toString, can we add the missing comma in the string builder 
when doing the new append operation?

*BalanceProcedure.java*
{noformat}
public static final Logger LOG =
 + LoggerFactory.getLogger(BalanceProcedure.class.getName());
{noformat}
BalanceProcedure.class.getName() --> BalanceProcedure.class

It would look better if we added comments for these variables.
{noformat}
+  private String nextProcedure;
+  private String name;
+  private long delayDuration;
+  private BalanceJob job;

{noformat}
I don't think the following constructor is needed; I can't find any place 
that invokes it.
{noformat}
public BalanceProcedure() {  }
{noformat}
*BalanceProcedureScheduler.java*
{noformat}
+  private ConcurrentHashMap jobSet;
+  private LinkedBlockingQueue runningQueue;
+  private DelayQueue delayQueue;
+  private LinkedBlockingQueue recoverQueue;
+  private Configuration conf;
+  private BalanceJournal journal;
+
+  private Thread reader;
+  private ThreadPoolExecutor workersPool;
+  private Thread rooster;
+  private Thread recoverThread;
+  private AtomicBoolean running = new AtomicBoolean(true);
{noformat}
Two suggestions for the above lines:
 * Can we add comments for these variables?
 * Can we unify the naming pattern of the thread names, like reader -> 
readerThread, rooster -> roosterThread?

*HDFSJournal.java*
{noformat}
+/**
+ * Journal based on HDFS.
+ */
+public class HDFSJournal implements BalanceJournal {
{noformat}
The class name HDFSJournal is a little confusing. Can we rename it to something 
more readable, like BalanceJournalInfo?

In addition, please add more detail to this Javadoc description.

Can we log some tracking information in the saveJob/recoverJob/listAllJobs/clear methods? 
It would let users know the details of the Balancer job journal operations.

*MultiPhaseProcedure.java*
 * LOG.info("phase {}", currentPhase); --> LOG.info("Current phase {}", 
currentPhase);
 * Not needed: {{public MultiPhaseProcedure() {}}}

*RecordProcedure.java*
 Not needed: {{public RecordProcedure() {}}}

*RetryProcedure.java*
 Not needed: {{public RetryProcedure() {}}}

I will give more detailed review comments soon.

> RBF: Balance data across federation namespaces with DistCp and snapshot diff 
> / Step 1: The State Machine(BalanceProcedureScheduler)
> ---
>
> Key: HDFS-15340
> URL: https://issues.apache.org/jira/browse/HDFS-15340
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15340.001.patch, HDFS-15340.002.patch
>
>
> Patch in HDFS-15294 is too big to review so we split it into 2 patches. This 
> is the first one. Detail can be found at HDFS-15294.






[jira] [Commented] (HDFS-15311) [SBN Read] High frequency reQueue cause Reader's performance to degrade

2020-05-10 Thread xuzq (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103802#comment-17103802
 ] 

xuzq commented on HDFS-15311:
-

Thanks [~shv], I will watch 
[HDFS-15291|https://issues.apache.org/jira/browse/HDFS-15291].

In my test, the Observer Handler only slept for a short period of time, and the 
ProcessTime of the client dropped significantly.

 

> [SBN Read] High frequency reQueue cause Reader's performance to degrade
> ---
>
> Key: HDFS-15311
> URL: https://issues.apache.org/jira/browse/HDFS-15311
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: xuzq
>Priority: Major
>
> If _autoMsyncPeriodMs_ is 0, the client will do an _msync_ for each read RPC.
> On the observer server side, this causes high-frequency reQueue in the Handler.
> Because the queue is a BlockingQueue, Readers (a small number) and 
> Handlers (a large number) compete for the BlockingQueue locks, 
> which decreases throughput.
>  
> Maybe we can let the handler sleep for a short time while waiting for the StateId, to 
> reduce reQueue.
>  
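
The idea of sleeping briefly instead of requeueing immediately could look roughly like the sketch below. It is not taken from the Hadoop RPC server: `BoundedStateWait` and `awaitState` are hypothetical names, and the server state is modeled as a plain `LongSupplier`.

```java
import java.util.function.LongSupplier;

// Hypothetical helper for illustration only; not part of the Hadoop RPC server.
final class BoundedStateWait {
  // Polls the server state id until it reaches requiredStateId or the time
  // budget is used up. Returns true if the state caught up; false means the
  // caller should fall back to requeueing the call.
  static boolean awaitState(LongSupplier serverStateId, long requiredStateId,
                            long maxWaitMillis, long pollMillis) {
    long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
    while (serverStateId.getAsLong() < requiredStateId) {
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

The wait is bounded so that a handler stuck behind a slow observer still gives the call back to the queue eventually; the trade-off is that a sleeping handler cannot serve other calls, so the budget would have to stay small.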






[jira] [Commented] (HDFS-15348) [SBN Read] IllegalStateException happened when doing failover

2020-05-10 Thread xuzq (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103796#comment-17103796
 ] 

xuzq commented on HDFS-15348:
-

Maybe we can turn off `dfs.ha.tail-edits.in-progress` when the Standby transitions 
to Active, and turn it back on when the Active transitions to Standby.

> [SBN Read] IllegalStateException happened when doing failover
> -
>
> Key: HDFS-15348
> URL: https://issues.apache.org/jira/browse/HDFS-15348
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: xuzq
>Priority: Major
>
> The Standby shuts down when doing failover, throwing an IllegalStateException.
> `getJournaledEdits` only returns `dfs.ha.tail-edits.qjm.rpc.max-txns` edits, 
> resulting in a failure to replay all edits in `catchupDuringFailover()`.
>  
> The `streams.isEmpty()` check in `FSEditLog#openForWrite` then throws this 
> exception.
> The exception looks like:
>  
> {code:java}
> 2020-05-10 09:20:02,235 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode IPC Server handler 763 on 8022: Error encountered requiring NN shutdown. Shutting down immediately.
> java.lang.IllegalStateException: Cannot start writing at txid 173922195318 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@47b73995
> at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:320)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1352)
> at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1890)
> at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
> at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
> at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1763)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1605)
> {code}
>  






[jira] [Created] (HDFS-15348) [SBN Read] IllegalStateException happened when doing failover

2020-05-10 Thread xuzq (Jira)
xuzq created HDFS-15348:
---

 Summary: [SBN Read] IllegalStateException happened when doing 
failover
 Key: HDFS-15348
 URL: https://issues.apache.org/jira/browse/HDFS-15348
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: xuzq


The Standby shuts down when doing failover, throwing an IllegalStateException.

`getJournaledEdits` only returns `dfs.ha.tail-edits.qjm.rpc.max-txns` edits, 
resulting in a failure to replay all edits in `catchupDuringFailover()`.

The `streams.isEmpty()` check in `FSEditLog#openForWrite` then throws this 
exception.

The exception looks like:

 
{code:java}
2020-05-10 09:20:02,235 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode IPC Server handler 763 on 8022: Error encountered requiring NN shutdown. Shutting down immediately.
java.lang.IllegalStateException: Cannot start writing at txid 173922195318 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@47b73995
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:320)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1352)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1890)
at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1763)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1605)
{code}
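
The failure mode in this report can be illustrated with a toy model. Everything here is an assumption made for illustration: `ToyJournal`, `fetchBatch`, and `hasUnreadEdits` are hypothetical and do not correspond to the QJM or `FSEditLog` APIs; only the shape of the problem (a capped fetch followed by a "nothing left to read" precondition) comes from the report.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only; ToyJournal and its methods are hypothetical, not HDFS APIs.
final class ToyJournal {
  private final Deque<Long> pending = new ArrayDeque<>();
  private final int maxTxnsPerCall;

  ToyJournal(long firstTxId, int count, int maxTxnsPerCall) {
    for (int i = 0; i < count; i++) {
      pending.add(firstTxId + i);
    }
    this.maxTxnsPerCall = maxTxnsPerCall;
  }

  // Returns at most maxTxnsPerCall edits, mimicking a per-call cap.
  List<Long> fetchBatch() {
    List<Long> batch = new ArrayList<>();
    while (batch.size() < maxTxnsPerCall && !pending.isEmpty()) {
      batch.add(pending.poll());
    }
    return batch;
  }

  boolean hasUnreadEdits() {
    return !pending.isEmpty();
  }

  // Mirrors the precondition in the report: writing may only start
  // once nothing is left to read.
  void openForWrite() {
    if (hasUnreadEdits()) {
      throw new IllegalStateException(
          "Cannot start writing while edits are still available for read");
    }
  }
}
```

With 5 pending edits and a cap of 2, a single `fetchBatch()` followed by `openForWrite()` throws, while draining in a loop until `hasUnreadEdits()` is false lets `openForWrite()` succeed. This only shows why one capped fetch is not enough; it does not claim to be the fix chosen for the issue.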
 






[jira] [Commented] (HDFS-15300) RBF: updateActiveNamenode() is invalid when RPC address is IP

2020-05-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103743#comment-17103743
 ] 

Hadoop QA commented on HDFS-15300:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 20m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 19s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m 1s{color} | {color:red} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 59s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 0s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}144m 33s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.io.compress.snappy.TestSnappyCompressorDecompressor |
|   | hadoop.io.compress.TestCompressorDecompressor |
|   | hadoop.ha.TestZKFailoverControllerStress |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29260/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15300 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13002519/HDFS-15300-002.patch |
| Optional Tests | dupname asflicense compile javac javadoc 

[jira] [Updated] (HDFS-15300) RBF: updateActiveNamenode() is invalid when RPC address is IP

2020-05-10 Thread xuzq (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuzq updated HDFS-15300:

Attachment: HDFS-15300-002.patch

> RBF: updateActiveNamenode() is invalid when RPC address is IP
> -
>
> Key: HDFS-15300
> URL: https://issues.apache.org/jira/browse/HDFS-15300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
> Attachments: HDFS-15300-001.patch, HDFS-15300-002.patch
>
>
> ActiveNamenodeResolver#updateActiveNamenode is invalid when the RPC address 
> is given as ip:port.






[jira] [Commented] (HDFS-15300) RBF: updateActiveNamenode() is invalid when RPC address is IP

2020-05-10 Thread xuzq (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103681#comment-17103681
 ] 

xuzq commented on HDFS-15300:
-

Thanks [~ayushtkn] [~elgoiri], please review [^HDFS-15300-002.patch]

> RBF: updateActiveNamenode() is invalid when RPC address is IP
> -
>
> Key: HDFS-15300
> URL: https://issues.apache.org/jira/browse/HDFS-15300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
> Attachments: HDFS-15300-001.patch, HDFS-15300-002.patch
>
>
> ActiveNamenodeResolver#updateActiveNamenode is invalid when the RPC address 
> is given as ip:port.






[jira] [Commented] (HDFS-15250) Setting `dfs.client.use.datanode.hostname` to true can crash the system because of unhandled UnresolvedAddressException

2020-05-10 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103674#comment-17103674
 ] 

Hudson commented on HDFS-15250:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18231 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18231/])
HDFS-15250. Setting `dfs.client.use.datanode.hostname` to true can crash 
(ayushsaxena: rev aab9e0b16ecc8fa00228c00c7ab90e55195cf5f4)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java


> Setting `dfs.client.use.datanode.hostname` to true can crash the system 
> because of unhandled UnresolvedAddressException
> ---
>
> Key: HDFS-15250
> URL: https://issues.apache.org/jira/browse/HDFS-15250
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ctest
>Assignee: Ctest
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-15250-001.patch, HDFS-15250-002.patch
>
>
> *Problem:*
> `dfs.client.use.datanode.hostname` by default is set to false, which means 
> the client will use the IP address of the datanode to connect to the 
> datanode, rather than the hostname of the datanode.
> In `org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer`:
>  
> {code:java}
>  try {
>    Peer peer = remotePeerFactory.newConnectedPeer(inetSocketAddress, token,
>    datanode);
>    LOG.trace("nextTcpPeer: created newConnectedPeer {}", peer);
>    return new BlockReaderPeer(peer, false);
>  } catch (IOException e) {
>    LOG.trace("nextTcpPeer: failed to create newConnectedPeer connected to"
>    + "{}", datanode);
>    throw e;
>  }
> {code}
>  
> If `dfs.client.use.datanode.hostname` is false, then it will try to connect 
> via IP address. If the IP address is illegal and the connection fails, 
> IOException will be thrown from `newConnectedPeer` and be handled.
> If `dfs.client.use.datanode.hostname` is true, then it will try to connect 
> via hostname. If the hostname cannot be resolved, UnresolvedAddressException 
> will be thrown from `newConnectedPeer`. However, UnresolvedAddressException 
> is not a subclass of IOException so `nextTcpPeer` doesn’t handle this 
> exception at all. This unhandled exception could crash the system.
>  
> *Solution:*
> Since the method already handles an illegal IP address, an illegal 
> hostname should be handled as well. One solution is to add the handling 
> logic in `nextTcpPeer`:
> {code:java}
>  } catch (IOException e) {
>    LOG.trace("nextTcpPeer: failed to create newConnectedPeer connected to"
>    + "{}", datanode);
>    throw e;
>  } catch (UnresolvedAddressException e) {
>    ... // handling logic 
>  }{code}
> I am very happy to provide a patch to do this.
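
The exception hierarchy described in the report can be reproduced with the JDK alone. The sketch below is not the committed patch: `UnresolvedAddressDemo` and `connectOrIOException` are hypothetical names, and wrapping the exception in an `IOException` is just one possible handling strategy, chosen here so that an existing `catch (IOException e)` block also sees unresolved hostnames.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

// Illustrative helper; connectOrIOException is a hypothetical name, not an HDFS method.
final class UnresolvedAddressDemo {
  // Converts the unchecked UnresolvedAddressException into an IOException so
  // that callers with an IOException handler also see unresolved hostnames.
  static SocketChannel connectOrIOException(InetSocketAddress addr) throws IOException {
    SocketChannel channel = SocketChannel.open();
    try {
      channel.connect(addr);
      return channel;
    } catch (UnresolvedAddressException e) {
      channel.close();
      throw new IOException("Unresolved address: " + addr, e);
    } catch (IOException e) {
      channel.close();
      throw e;
    }
  }

  // True: the exception is a RuntimeException (it extends IllegalArgumentException),
  // which is why a plain "catch (IOException e)" never sees it.
  static boolean isUnchecked() {
    return RuntimeException.class.isAssignableFrom(UnresolvedAddressException.class);
  }
}
```

Passing an address built with `InetSocketAddress.createUnresolved(...)` to `connectOrIOException` yields an `IOException` whose cause is the `UnresolvedAddressException`, without any network lookup taking place.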






[jira] [Commented] (HDFS-15250) Setting `dfs.client.use.datanode.hostname` to true can crash the system because of unhandled UnresolvedAddressException

2020-05-10 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103673#comment-17103673
 ] 

Ayush Saxena commented on HDFS-15250:
-

Committed to trunk, branch-3.3, branch-3.2 and branch-3.1. Thanks [~ctest.team] for the 
contribution.

> Setting `dfs.client.use.datanode.hostname` to true can crash the system 
> because of unhandled UnresolvedAddressException
> ---
>
> Key: HDFS-15250
> URL: https://issues.apache.org/jira/browse/HDFS-15250
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ctest
>Assignee: Ctest
>Priority: Major
> Attachments: HDFS-15250-001.patch, HDFS-15250-002.patch
>
>






[jira] [Updated] (HDFS-15250) Setting `dfs.client.use.datanode.hostname` to true can crash the system because of unhandled UnresolvedAddressException

2020-05-10 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15250:

Fix Version/s: 3.1.5
   3.4.0
   3.3.1
   3.2.2
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Setting `dfs.client.use.datanode.hostname` to true can crash the system 
> because of unhandled UnresolvedAddressException
> ---
>
> Key: HDFS-15250
> URL: https://issues.apache.org/jira/browse/HDFS-15250
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ctest
>Assignee: Ctest
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-15250-001.patch, HDFS-15250-002.patch
>
>


