[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2020-04-16 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085175#comment-17085175
 ] 

Jonathan Hung commented on HDFS-15036:
--

Pushed to branch-2.10.

> Active NameNode should not silently fail the image transfer
> ---
>
> Key: HDFS-15036
> URL: https://issues.apache.org/jira/browse/HDFS-15036
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15036.001.patch, HDFS-15036.002.patch, 
> HDFS-15036.003.patch
>
>
> Image transfer from Standby NameNode to  Active silently fails on Active, 
> without any logging and not notifying the receiver side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-17 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998561#comment-16998561
 ] 

Chen Liang commented on HDFS-15036:
---

[~Jim_Brennan] I filed https://issues.apache.org/jira/browse/INFRA-19581, but 
haven't gotten an update from the Infra folks yet.




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-17 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998526#comment-16998526
 ] 

Jim Brennan commented on HDFS-15036:


[~shv], [~jhung] was branch-2 actually deleted? I can still see it, and this 
commit is still there.




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-13 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996000#comment-16996000
 ] 

Chen Liang commented on HDFS-15036:
---

Oops! Did not realize it's already deleted, guess I missed the messages... will 
work on deleting it again...




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-13 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995976#comment-16995976
 ] 

Konstantin Shvachko commented on HDFS-15036:


[~vagarychen] we should commit to branch-2.10. branch-2 was deleted as per 
discussion on hdfs-dev.




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-12 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995038#comment-16995038
 ] 

Chen Liang commented on HDFS-15036:
---

Thanks [~shv]! I've committed to trunk and branch-2.




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-12 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994950#comment-16994950
 ] 

Hudson commented on HDFS-15036:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17758 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17758/])
HDFS-15036. Active NameNode should not silently fail the image transfer. (cliang: rev 65c4660bcd897e139fc175ca438cff75ec0c6be8)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java





[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-11 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993972#comment-16993972
 ] 

Konstantin Shvachko commented on HDFS-15036:


+1 on v03 patch.
TestFsck failure is tracked under HDFS-15038.
And the checkstyle warning is bogus.




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993325#comment-16993325
 ] 

Hadoop QA commented on HDFS-15036:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 49s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 1s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 23m 35s | trunk passed |
| +1 | compile | 1m 13s | trunk passed |
| +1 | checkstyle | 1m 2s | trunk passed |
| +1 | mvnsite | 1m 31s | trunk passed |
| +1 | shadedclient | 17m 50s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 2m 49s | hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 30s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 15s | the patch passed |
| +1 | compile | 1m 10s | the patch passed |
| +1 | javac | 1m 10s | the patch passed |
| -0 | checkstyle | 0m 41s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 88 unchanged - 0 fixed = 89 total (was 88) |
| +1 | mvnsite | 1m 20s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 6s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 19s | the patch passed |
| +1 | javadoc | 1m 9s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 107m 35s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 180m 45s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestFsck |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15036 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988486/HDFS-15036.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 32b29ff6bfad 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c2e9783 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/28499/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28499/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | 

[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993224#comment-16993224
 ] 

Chen Liang commented on HDFS-15036:
---

Thanks for the review [~shv], uploaded v03 patch




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993150#comment-16993150
 ] 

Konstantin Shvachko commented on HDFS-15036:


Looks good. Minor things:
# Typo in {{doCheckpoint()}}. Removed -is- in:
{code}// by the other node. This could happen if{code}
# Should use parameterized logging:
{code}LOG.info("Image upload rejected by the other NameNode: {}", uploadResult);{code}





[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993133#comment-16993133
 ] 

Hadoop QA commented on HDFS-15036:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 43s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 21m 7s | trunk passed |
| +1 | compile | 1m 9s | trunk passed |
| +1 | checkstyle | 0m 46s | trunk passed |
| +1 | mvnsite | 1m 15s | trunk passed |
| +1 | shadedclient | 15m 17s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 2m 34s | hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 23s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 11s | the patch passed |
| +1 | compile | 0m 58s | the patch passed |
| +1 | javac | 0m 58s | the patch passed |
| -0 | checkstyle | 0m 40s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 88 unchanged - 0 fixed = 89 total (was 88) |
| +1 | mvnsite | 1m 8s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 50s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 29s | the patch passed |
| +1 | javadoc | 1m 14s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 103m 17s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 169m 24s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.client.TestQuorumJournalManager |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.TestFileAppend2 |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.qjournal.client.TestQJMWithFaults |
|   | hadoop.hdfs.TestWriteRead |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15036 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988468/HDFS-15036.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 21686e70fb56 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 875a3e9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| findbugs | 

[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-10 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993033#comment-16993033
 ] 

Chen Liang commented on HDFS-15036:
---

Thanks for taking a look [~shv]! Posted the v002 patch. The failed tests all 
passed in my local run.




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-09 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992155#comment-16992155
 ] 

Hadoop QA commented on HDFS-15036:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 47s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 20m 23s | trunk passed |
| +1 | compile | 1m 0s | trunk passed |
| +1 | checkstyle | 0m 44s | trunk passed |
| +1 | mvnsite | 1m 11s | trunk passed |
| +1 | shadedclient | 14m 43s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 2m 17s | hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 20s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 10s | the patch passed |
| +1 | compile | 0m 57s | the patch passed |
| +1 | javac | 0m 57s | the patch passed |
| +1 | checkstyle | 0m 39s | the patch passed |
| +1 | mvnsite | 1m 0s | the patch passed |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 31s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 20s | the patch passed |
| +1 | javadoc | 1m 10s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 99m 20s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 32s | The patch does not generate ASF License warnings. |
| | | 162m 55s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.server.namenode.TestFsck |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15036 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988378/HDFS-15036.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 0e77d17e1e66 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / dc66de7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/28488/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| unit | 

[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-09 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992138#comment-16992138
 ] 

Konstantin Shvachko commented on HDFS-15036:


Good investigation and findings [~vagarychen].
# Could you add a comment explaining that {{ImageServlet}} should not reject 
images other than checkpoints?
# I am still concerned about the "silent" part. Should we add some logging, so 
that next time we can see what happened on both nodes?




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-09 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992029#comment-16992029
 ] 

Chen Liang commented on HDFS-15036:
---

[~csun] np, sure, thanks for asking :) . Assigning to myself then.




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-09 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992018#comment-16992018
 ] 

Chao Sun commented on HDFS-15036:
-

[~vagarychen] sorry for grabbing this JIRA too soon :) Since you have done much 
study on this, do you want to take this JIRA instead?




[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-09 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991998#comment-16991998
 ] 

Chen Liang commented on HDFS-15036:
---

Spent some time debugging this issue, and I think I found the cause.

In HDFS-12979, we introduced logic that rejects an image upload request if the
image being uploaded is not far enough ahead of the previous image. This
prevents the scenario where, with multiple SbNs, all SbNs upload images to the
ANN too frequently. The rejection is considered correct behavior, so nothing is
logged as an error (the "silent" part). Both ANN and SbN simply ignore it and
proceed.

However, a side effect of this change is that during a rolling upgrade (RU) the
rollback image also goes through this check and can be rejected as well. When
that happens, the SbN proceeds as if the upload succeeded, while the ANN
proceeds without ever having received the rollback image. The upload fails
silently.

The check that rejects the upload is in {{ImageServlet}}. In my earlier test I
simply commented out the whole block below, and the issue seemed to go away. I
think the proper fix is to add a condition so that the rejection applies only
to regular image uploads, as in the line marked in the following snippet. I
haven't actually tested this change yet:
{code}
  if (checkRecentImageEnable &&
      NameNodeFile.IMAGE.equals(parsedParams.getNameNodeFile()) && // <--- this should fix the issue
      timeDelta < checkpointPeriod &&
      txid - lastCheckpointTxid < checkpointTxnCount) {
    // only when at least one of two conditions are met we accept
    // a new fsImage
    // 1. most recent image's txid is too far behind
    // 2. last checkpoint time was too old
    response.sendError(HttpServletResponse.SC_CONFLICT,
        "Most recent checkpoint is neither too far behind in "
        + "txid, nor too old. New txnid cnt is "
        + (txid - lastCheckpointTxid)
        + ", expecting at least " + checkpointTxnCount
        + " unless too long since last upload.");
    return null;
  }
{code}
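
To make the effect of the extra condition easier to see, here is a minimal standalone model of the decision. It is only a sketch, not the Hadoop source: the class {{RecentImageCheckSketch}}, the {{ImageKind}} enum and the parameter names are stand-ins invented for illustration, and the numbers in {{main}} are arbitrary.

```java
// Hypothetical standalone model of the upload-rejection decision.
// Nothing here is Hadoop code; all names are illustrative assumptions.
final class RecentImageCheckSketch {

  /** Stand-in for the two kinds of image discussed in this thread. */
  enum ImageKind { IMAGE, IMAGE_ROLLBACK }

  /**
   * Returns true when the upload would be rejected as "too recent".
   * With the proposed condition, only regular images are subject to the
   * check, so a rollback image is never rejected by it.
   */
  static boolean shouldReject(boolean checkRecentImageEnable, ImageKind kind,
      long timeDelta, long checkpointPeriod,
      long txid, long lastCheckpointTxid, long checkpointTxnCount) {
    return checkRecentImageEnable
        && kind == ImageKind.IMAGE
        && timeDelta < checkpointPeriod
        && txid - lastCheckpointTxid < checkpointTxnCount;
  }

  public static void main(String[] args) {
    // Recent checkpoint and few new transactions: a regular image is rejected.
    System.out.println(shouldReject(true, ImageKind.IMAGE,
        60, 3600, 1_100, 1_000, 1_000_000));
    // The same numbers for a rollback image: the check no longer applies.
    System.out.println(shouldReject(true, ImageKind.IMAGE_ROLLBACK,
        60, 3600, 1_100, 1_000, 1_000_000));
  }
}
```

Without the {{kind == ImageKind.IMAGE}} term, both calls would take the rejection path, which matches the rolling upgrade symptom described above.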








[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-06 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990155#comment-16990155
 ] 

Konstantin Shvachko commented on HDFS-15036:


This can happen during checkpointing or while preparing for a rolling upgrade.
We observed it during a rolling upgrade: the Standby was reporting _"Rollback
image has been created. Proceed to upgrade daemons."_ while the Active still
reported _"Rollback image has not been created."_

In the logs for ANN I see that it started receiving the image:
{code:java}
2019-12-05 23:14:56,328 INFO org.apache.hadoop.hdfs.server.namenode.ImageServlet: ImageServlet allowing checkpointer: hdfs/active.namenode.com
{code}
After that, the ANN did not print anything related to the image transfer, and
the transferred image is missing from its storage directory.
The ANN log message above comes from {{isValidRequestor()}}, called by
{{ImageServlet.doPut()}}.

The SBN log indicates that the image was fully and successfully transferred to
the ANN:
{code:java}
2019-12-05 23:22:29,526 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Sending fileName: /hdfs-storage-dir/current/fsimage_rollback_00773999609, fileSize: 1889021016. Sent total: 1889021016 bytes. Size of last segment intended to send: -1 bytes.
{code}
The SBN log message comes from {{TransferFsImage.copyFileToStream()}}.

Looking at the code in {{ImageServlet.doPut()}}, I see that one of the methods
it calls is {{Util.receiveFile()}}. If an exception is thrown there, inside the
while-loop that reads from the input (socket) stream and writes to the output
(image file) stream, it passes through a series of {{finally}} sections without
being caught, logged, or reported to the sender.

We should:
 # Catch and log any exceptions occurring there
 # Notify SBN about the error, so that it could retry the transfer
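
The two action items above could be sketched roughly as follows. This is a standalone illustration and not the actual patch: {{ReceiveImageSketch}}, {{ErrorReporter}} and the use of status code 500 are assumptions made for the example. In the real servlet the notification would presumably go through the HTTP response instead of a callback.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; the names below are illustrative, not Hadoop APIs.
final class ReceiveImageSketch {
  private static final Logger LOG =
      Logger.getLogger(ReceiveImageSketch.class.getName());

  /** Assumed status code for a failed transfer (HTTP 500). */
  static final int INTERNAL_ERROR = 500;

  /** Hypothetical callback standing in for an error response to the sender. */
  interface ErrorReporter {
    void report(int statusCode, String message);
  }

  /**
   * Copies the input stream to the output stream. On failure, logs the
   * exception and notifies the sender through the reporter.
   * Returns the number of bytes copied, or -1 if the transfer failed.
   */
  static long receive(InputStream in, OutputStream out, ErrorReporter reporter) {
    byte[] buf = new byte[8192];
    long total = 0;
    try {
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
        total += n;
      }
      out.flush();
      return total;
    } catch (IOException e) {
      // 1. Catch and log the exception instead of letting it vanish.
      LOG.log(Level.WARNING, "Image transfer failed after " + total + " bytes", e);
      // 2. Tell the sender, so that it can retry the transfer.
      reporter.report(INTERNAL_ERROR, "Image transfer failed: " + e.getMessage());
      return -1;
    }
  }

  public static void main(String[] args) {
    // A stream that fails on the first read, standing in for a broken socket.
    InputStream broken = new InputStream() {
      @Override
      public int read() throws IOException {
        throw new IOException("connection reset");
      }
    };
    long copied = receive(broken, new ByteArrayOutputStream(),
        (code, msg) -> System.out.println("to sender: " + code + " " + msg));
    System.out.println("copied = " + copied);
  }
}
```

Returning -1 here is only a convenience for the example. Rethrowing after logging and reporting would work as well, as long as the failure is no longer silent on either side.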



