[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987610#comment-16987610
 ] 

Ayush Saxena commented on HDFS-15023:
-

Thanx [~ferhui], overall looks good.

{code:java}
System.setIn(inOriginial);

{code}

In the test, this should probably be moved to a finally block; otherwise, if 
the test fails midway, the stream won't get reset.
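For illustration, a minimal sketch of the suggested structure (hypothetical test body, not the actual patch):
{code:java}
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical sketch: restore the original System.in in a finally block
// so the stream is reset even if an assertion fails midway through the test.
public class StdinRestoreSketch {
  public static void main(String[] args) {
    InputStream inOriginial = System.in;
    System.setIn(new ByteArrayInputStream("Y\n".getBytes()));
    try {
      // ... exercise the code under test that reads from System.in ...
    } finally {
      System.setIn(inOriginial); // always runs, even when the test fails
    }
  }
}
{code}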

[~vinayakumarb] is it fine with you? Anything you would like to add...

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch, HDFS-15023.002.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state 
> is observer.
> Right now, when a namenode is an observer, it still joins the election and 
> then becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby
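A minimal sketch of the guard being discussed (hypothetical and simplified; elector, localTarget and lastServiceState stand for the corresponding ZKFC fields, this is not the actual patch):
{code:java}
import org.apache.hadoop.ha.HAServiceProtocol.HAServiceState;

// Hypothetical sketch: bail out of recheckElectability() when the local
// NameNode is an Observer, so the ZK callback can never flip it to standby.
private void recheckElectability() {
  // ... existing health-state checks ...
  if (lastServiceState == HAServiceState.OBSERVER) {
    return; // an Observer must not join the election
  }
  elector.joinElection(targetToData(localTarget));
}
{code}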






[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-12-03 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987586#comment-16987586
 ] 

Jinglun commented on HDFS-13811:


Hi [~linyiqun], thanks for your comments! I rolled back the changes except the 
annotation. Uploaded v07.

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, 
> HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch, 
> HDFS-13811.007.patch
>
>
> If we try to update the quota of an existing mount entry while the periodic 
> quota update service is running on the same mount entry, the mount table is 
> left in an _inconsistent state_.
> The transactions are:
> A - Quota update service fetches mount table entries.
> B - Quota update service updates the mount table with current usage.
> A' - User updates the quota using the admin cmd.
> With the transaction sequence [ A A' B ], the quota update service updates 
> the mount table with the old quota value.
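In other words, this is a classic read-modify-write lost update; a self-contained, hypothetical illustration (not RBF code):
{code:java}
// Hypothetical illustration of the [ A A' B ] interleaving: the periodic
// service writes back a snapshot taken before the admin update, so the
// new quota is silently overwritten.
public class LostUpdateDemo {
  static long storedQuota = 10;

  public static void main(String[] args) {
    long snapshot = storedQuota; // A:  service reads the entry (quota = 10)
    storedQuota = 20;            // A': admin sets quota = 20
    storedQuota = snapshot;      // B:  service writes the stale snapshot back
    System.out.println(storedQuota); // prints 10 -- the admin update is lost
  }
}
{code}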






[jira] [Updated] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-12-03 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-13811:
---
Attachment: HDFS-13811.007.patch

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, 
> HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch, 
> HDFS-13811.007.patch
>
>
> If we try to update the quota of an existing mount entry while the periodic 
> quota update service is running on the same mount entry, the mount table is 
> left in an _inconsistent state_.
> The transactions are:
> A - Quota update service fetches mount table entries.
> B - Quota update service updates the mount table with current usage.
> A' - User updates the quota using the admin cmd.
> With the transaction sequence [ A A' B ], the quota update service updates 
> the mount table with the old quota value.






[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-12-03 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-15023:
---
Attachment: HDFS-15023.002.patch

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch, HDFS-15023.002.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state 
> is observer.
> Right now, when a namenode is an observer, it still joins the election and 
> then becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby






[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-12-03 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987578#comment-16987578
 ] 

Fei Hui commented on HDFS-15023:


[~ayushtkn] Thanks for your comments. I made a mistake: I shouldn't have 
changed the fix mentioned in HDFS-14961.
Uploaded v002 patch with a UT.

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state 
> is observer.
> Right now, when a namenode is an observer, it still joins the election and 
> then becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby






[jira] [Commented] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI

2019-12-03 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987568#comment-16987568
 ] 

Xieming Li commented on HDFS-14990:
---

[~ayushtkn]

Thank you for your feedback.

I will just ignore this ticket for the moment.

> HDFS: No symbolic icon to represent decommissioning state of datanode in Name 
> node WEB UI
> -
>
> Key: HDFS-14990
> URL: https://issues.apache.org/jira/browse/HDFS-14990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, ui
>Affects Versions: 3.2.1
>Reporter: Souryakanta Dwivedy
>Assignee: Xieming Li
>Priority: Minor
> Attachments: image-2019-11-15-17-31-23-213.png, 
> image-2019-11-16-02-09-10-545.png
>
>
> There is no symbolic icon to represent the decommissioning state of a 
> datanode in the NameNode web UI.
> Expected output:
> Like the other datanode states (In-service, Down, Decommissioned, etc.), an 
> icon should also be added for the decommissioning state.
>
> !image-2019-11-15-17-31-23-213.png!






[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-03 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987567#comment-16987567
 ] 

Xieming Li commented on HDFS-14983:
---

I have uploaded a patch that works, and I hope to get some feedback.

I performed a very simple test in my environment.
In core-site.xml I added:
{code:java}
-  
-  
{code}
{code:java}
$ export HADOOP_PROXY_USER=dummyuser
$ hdfs dfs -ls
ls: User: sri@DEV is not allowed to impersonate dummyuser
$
$ sudo hdfs dfsrouteradmin -refreshSuperUserGroupsConfiguration
Successfully updated super user groups configuration on router 0.0.0.0:8111
$
$ hdfs dfs -ls
{code}
If everything looks okay, I will keep adding unit tests, Javadoc, and documentation.

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Commented] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987566#comment-16987566
 ] 

Ayush Saxena commented on HDFS-14990:
-

The symbol is there for me in the UI too.

> HDFS: No symbolic icon to represent decommissioning state of datanode in Name 
> node WEB UI
> -
>
> Key: HDFS-14990
> URL: https://issues.apache.org/jira/browse/HDFS-14990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, ui
>Affects Versions: 3.2.1
>Reporter: Souryakanta Dwivedy
>Assignee: Xieming Li
>Priority: Minor
> Attachments: image-2019-11-15-17-31-23-213.png, 
> image-2019-11-16-02-09-10-545.png
>
>
> There is no symbolic icon to represent the decommissioning state of a 
> datanode in the NameNode web UI.
> Expected output:
> Like the other datanode states (In-service, Down, Decommissioned, etc.), an 
> icon should also be added for the decommissioning state.
>
> !image-2019-11-15-17-31-23-213.png!






[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-03 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Attachment: HDFS-14983.draft.001.patch
Status: Patch Available  (was: In Progress)

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Updated] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI

2019-12-03 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14990:
--
Status: Open  (was: Patch Available)

> HDFS: No symbolic icon to represent decommissioning state of datanode in Name 
> node WEB UI
> -
>
> Key: HDFS-14990
> URL: https://issues.apache.org/jira/browse/HDFS-14990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, ui
>Affects Versions: 3.2.1
>Reporter: Souryakanta Dwivedy
>Assignee: Xieming Li
>Priority: Minor
> Attachments: image-2019-11-15-17-31-23-213.png, 
> image-2019-11-16-02-09-10-545.png
>
>
> There is no symbolic icon to represent the decommissioning state of a 
> datanode in the NameNode web UI.
> Expected output:
> Like the other datanode states (In-service, Down, Decommissioned, etc.), an 
> icon should also be added for the decommissioning state.
>
> !image-2019-11-15-17-31-23-213.png!






[jira] [Updated] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI

2019-12-03 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14990:
--
Attachment: (was: HDFS-14893.draft.001.patch)

> HDFS: No symbolic icon to represent decommissioning state of datanode in Name 
> node WEB UI
> -
>
> Key: HDFS-14990
> URL: https://issues.apache.org/jira/browse/HDFS-14990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, ui
>Affects Versions: 3.2.1
>Reporter: Souryakanta Dwivedy
>Assignee: Xieming Li
>Priority: Minor
> Attachments: image-2019-11-15-17-31-23-213.png, 
> image-2019-11-16-02-09-10-545.png
>
>
> There is no symbolic icon to represent the decommissioning state of a 
> datanode in the NameNode web UI.
> Expected output:
> Like the other datanode states (In-service, Down, Decommissioned, etc.), an 
> icon should also be added for the decommissioning state.
>
> !image-2019-11-15-17-31-23-213.png!






[jira] [Commented] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted

2019-12-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987561#comment-16987561
 ] 

Hadoop QA commented on HDFS-15031:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 36s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 26 unchanged - 0 fixed = 27 total (was 26) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 19s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 34s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}168m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer |
|   | hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.TestFileChecksumCompositeCrc |
|   | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSPermission |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987418/HDFS-15031.000.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux aa3f5e79d42e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI

2019-12-03 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14990:
--
Attachment: HDFS-14893.draft.001.patch
Status: Patch Available  (was: In Progress)

> HDFS: No symbolic icon to represent decommissioning state of datanode in Name 
> node WEB UI
> -
>
> Key: HDFS-14990
> URL: https://issues.apache.org/jira/browse/HDFS-14990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, ui
>Affects Versions: 3.2.1
>Reporter: Souryakanta Dwivedy
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14893.draft.001.patch, 
> image-2019-11-15-17-31-23-213.png, image-2019-11-16-02-09-10-545.png
>
>
> There is no symbolic icon to represent the decommissioning state of a 
> datanode in the NameNode web UI.
> Expected output:
> Like the other datanode states (In-service, Down, Decommissioned, etc.), an 
> icon should also be added for the decommissioning state.
>
> !image-2019-11-15-17-31-23-213.png!






[jira] [Commented] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI

2019-12-03 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987560#comment-16987560
 ] 

Xieming Li commented on HDFS-14990:
---

Ping.
Any thoughts on this ticket?

> HDFS: No symbolic icon to represent decommissioning state of datanode in Name 
> node WEB UI
> -
>
> Key: HDFS-14990
> URL: https://issues.apache.org/jira/browse/HDFS-14990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, ui
>Affects Versions: 3.2.1
>Reporter: Souryakanta Dwivedy
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14893.draft.001.patch, 
> image-2019-11-15-17-31-23-213.png, image-2019-11-16-02-09-10-545.png
>
>
> There is no symbolic icon to represent the decommissioning state of a 
> datanode in the NameNode web UI.
> Expected output:
> Like the other datanode states (In-service, Down, Decommissioned, etc.), an 
> icon should also be added for the decommissioning state.
>
> !image-2019-11-15-17-31-23-213.png!






[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987559#comment-16987559
 ] 

Hadoop QA commented on HDFS-15027:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
59s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 45s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.TestEncryptionZonesWithKMS |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15027 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987421/HDFS-15027.000.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 074700de44c3 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 54e7605 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28442/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28442/testReport/ |
| Max. process+thread count | 2928 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Commented] (HDFS-14997) BPServiceActor process command from NameNode asynchronously

2019-12-03 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987530#comment-16987530
 ] 

Xiaoqiao He commented on HDFS-14997:


Ping [~elgoiri], [~sodonnell], [~weichiu], any further comments here?

> BPServiceActor process command from NameNode asynchronously
> ---
>
> Key: HDFS-14997
> URL: https://issues.apache.org/jira/browse/HDFS-14997
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, 
> HDFS-14997.003.patch, HDFS-14997.004.patch, HDFS-14997.005.patch
>
>
> There are two core functions, report (#sendHeartbeat, #blockReport, 
> #cacheReport) and #processCommand, in the #BPServiceActor main process flow. 
> If processCommand takes a long time, it blocks the report flow. Meanwhile, 
> processCommand can take a long time (over 1000s in the worst case I have 
> met) when the IO load of the DataNode is very high. Since some IO operations 
> are under #datasetLock, it has to wait a long time to acquire #datasetLock 
> when processing some commands (such as #DNA_INVALIDATE). In such cases, the 
> #heartbeat will not be sent to the NameNode in time, which triggers other 
> disasters.
> I propose to process #processCommand asynchronously so it does not block 
> #BPServiceActor from sending heartbeats back to the NameNode under high IO 
> load.
> Notes:
> 1. Lifeline could be one effective solution; however, some old branches do 
> not support this feature.
> 2. IO operations under #datasetLock are another issue; I think we should 
> solve that in another JIRA.
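A minimal sketch of the direction proposed above (hypothetical, not the actual patch): hand commands to a single-threaded executor so the heartbeat loop never blocks on #datasetLock.
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: offload NameNode commands to a single-threaded
// executor so a slow DNA_INVALIDATE (blocked on datasetLock) cannot delay
// heartbeats. A single thread preserves the order commands are processed in.
class AsyncCommandProcessor {
  private final ExecutorService commandExecutor =
      Executors.newSingleThreadExecutor();

  void enqueue(Runnable processCommandTask) {
    commandExecutor.submit(processCommandTask); // heartbeat thread returns at once
  }
}
{code}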






[jira] [Commented] (HDFS-15010) BlockPoolSlice#addReplicaThreadPool static pool should be initialized by static method

2019-12-03 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987532#comment-16987532
 ] 

Surendra Singh Lilhore commented on HDFS-15010:
---

[~yuvaldeg], 

Sure, I will do it. I need to create a new patch; give me some time.

> BlockPoolSlice#addReplicaThreadPool static pool should be initialized by 
> static method
> --
>
> Key: HDFS-15010
> URL: https://issues.apache.org/jira/browse/HDFS-15010
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.1.2
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15010.001.patch, HDFS-15010.02.patch, 
> HDFS-15010.03.patch, HDFS-15010.04.patch, HDFS-15010.05.patch
>
>
> The {{BlockPoolSlice#initializeAddReplicaPool()}} method currently 
> initializes the static thread pool instance. But when two {{BPServiceActor}} 
> actors try to load block pools in parallel, it may create different 
> instances. 
> So {{BlockPoolSlice#initializeAddReplicaPool()}} should be a static method.
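A hedged sketch of the fix being described (simplified signatures and hypothetical class name):
{code:java}
import java.util.concurrent.ForkJoinPool;

// Simplified sketch: a static synchronized initializer guarantees that two
// BPServiceActor threads racing to load block pools still observe a single
// shared pool instance.
class BlockPoolSliceSketch {
  private static ForkJoinPool addReplicaThreadPool = null;

  private static synchronized void initializeAddReplicaPool(int numThreads) {
    if (addReplicaThreadPool == null) {
      addReplicaThreadPool = new ForkJoinPool(numThreads);
    }
  }
}
{code}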






[jira] [Commented] (HDFS-14519) NameQuota is not update after concat operation, so namequota is wrong

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987529#comment-16987529
 ] 

Ayush Saxena commented on HDFS-14519:
-

Will try to backport in a couple of days.

> NameQuota is not update after concat operation, so namequota is wrong
> -
>
> Key: HDFS-14519
> URL: https://issues.apache.org/jira/browse/HDFS-14519
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ranith Sardar
>Assignee: Ranith Sardar
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14519.001.patch, HDFS-14519.002.patch, 
> HDFS-14519.003.patch
>
>







[jira] [Commented] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987528#comment-16987528
 ] 

Ayush Saxena commented on HDFS-15009:
-

[~hemanthboyina] can you reopen and provide patches for the lower branches?

> FSCK "-list-corruptfileblocks" return Invalid Entries
> -
>
> Key: HDFS-15009
> URL: https://issues.apache.org/jira/browse/HDFS-15009
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-15009.001.patch, HDFS-15009.002.patch, 
> HDFS-15009.003.patch, HDFS-15009.004.patch
>
>
> Scenario: if we have two directories, dir1 and dir10, and only dir10 has 
> corrupt files, then when we run -list-corruptfileblocks for dir1, the 
> corrupt file count shown for dir1 is actually dir10's.
> {code:java}
>   while (blkIterator.hasNext()) {
> BlockInfo blk = blkIterator.next();
> final INodeFile inode = getBlockCollection(blk);
> skip++;
> if (inode != null) {
>   String src = inode.getFullPathName();
>   if (src.startsWith(path)){
> corruptFiles.add(new CorruptFileBlockInfo(src, blk));
> count++;
> if (count >= DEFAULT_MAX_CORRUPT_FILEBLOCKS_RETURNED)
>   break;
>   }
> }
>   } {code}
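The root cause is the bare prefix match above; a hypothetical boundary-aware check (illustrative only, not the committed fix):
{code:java}
// "/dir10/f" starts with "/dir1", so a bare startsWith() wrongly counts
// dir10's corrupt files under dir1. Require the match to end exactly at a
// path-separator boundary instead.
static boolean isUnderPath(String src, String path) {
  if (!src.startsWith(path)) {
    return false;
  }
  return src.length() == path.length()      // exact match
      || path.endsWith("/")                 // path already ends at a boundary
      || src.charAt(path.length()) == '/';  // next char is a separator
}
{code}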






[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-12-03 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986882#comment-16986882
 ] 

huhaiyang edited comment on HDFS-15024 at 12/4/19 4:17 AM:
---

[~xkrogen] [~csun] [~vagarychen] Thanks for your comments!
My understanding: normally, if we only set 2 NNs,

<property>
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2</value>
</property>

Currently,
nn1 is in active state
nn2 is in standby state

when the client connects to nn2, it needs to retry and will quickly connect to 
nn1. However, when the connection to nn1 fails due to network problems, the 
next attempt (the third) sleeps for a period of time before retrying.

HDFS-6440 now supports more than 2 NameNodes.
If we set 3 NNs,

<property>
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2,nn3</value>
</property>

nn1 is in active state
nn2 is in standby state
nn3 is in standby state (or observer state)

when the client connects to nn2, it needs to retry and will quickly connect to 
nn3, and when the client connects to nn3, it needs to retry and will quickly 
connect to nn1.
However, when the connection to nn1 fails due to network problems, the next 
attempt (the fourth) sleeps for a period of time before retrying.

That is to say, it is necessary to try all the configured NN nodes once.
If no NN node meeting the requirements is found, a sleep is performed before 
retrying...

In FailoverOnNetworkExceptionRetry#getFailoverOrRetrySleepTime, I think that 
using the number of NameNodes as a condition in the calculation of the sleep 
time is more reasonable (which is the current v01 patch).

Pls let me know whether it is correct. Thanks


was (Author: haiyang hu):
[~xkrogen] [~csun] [~vagarychen] Thanks for your comments!
I understand Normally, if we only set 2 NNS, 

dfs.ha.namenodes.ns1
nn1,nn2


Currently,
nn1 is in active state
nn2 is in standby state

when the client connects to nn2, it needs to retry, and will quickly connect to 
nn1. However, when the nn1 fails to connect due to network problems, the next 
time( the third time), the sleep timeout will be performed for a period of time 
to retry

Current HDFS-6440 Support more than 2 NameNodes.
if we set 3 NNS, 

dfs.ha.namenodes.ns1
nn1,nn2,nn3

nn1 is in active state
nn2 is in standby state
nn3 is in standby state(or observer state)

when the client connects to nn2, it needs to retry, and will quickly connect to 
nn3. 
and  the client connects to nn3, it needs to retry, and will quickly connect to 
nn1.
However, when the nn1 fails to connect due to network problems, the next time( 
the fourth time), the sleep timeout will be performed for a period of time to 
retry.

That is to say, it is necessary to connect all the configured NN nodes once.
If no NN nodes the requirements are found, required to perform sleep and 
retry...

In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime , I think that 
the Number of NameNodes as a condition of calculation of sleep time is more 
reasonable((which is current v01 patch)).


> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> When we enable the ONN, there will be three NN nodes in the client 
> configuration, such as:
> 
> <property>
>   <name>dfs.ha.namenodes.ns1</name>
>   <value>nn2,nn3,nn1</value>
> </property>
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an HDFS access operation:
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -mkdir /user/haiyang1/test8
> the msync method needs to reach nn1.
> The client actually connects to nn2 first, so a failover is required.
> The connection to nn3 does not meet the requirements either, so another 
> failover is needed, but this failover is only performed after a period of 
> sleep.
> Finally, it took a period of sleep before the request successfully connected 
> to nn1.
> In FailoverOnNetworkExceptionRetry#getFailoverOrRetrySleepTime, the current 
> default implementation calculates a sleep time whenever more than one 
> failover operation has been performed.
> I think that using the number of NameNodes as a condition in the calculation 
> of the sleep time is more reasonable.
> That is, in the current test, failing over from nn3 should not need any 
> sleep time before directly connecting to the next NN node.
> See client_error.log for details
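A hedged sketch of the idea described above (hypothetical names and constants, not the actual RetryPolicies code):
{code:java}
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch: back off only after every configured NameNode has
// been tried once, instead of after the first failover.
class FailoverSleepSketch {
  private final long delayMillis = 500;
  private final long maxDelayMillis = 15000;

  long getFailoverOrRetrySleepTime(int failovers, int numNameNodes) {
    if (failovers < numNameNodes) {
      return 0; // keep cycling through the remaining NNs immediately
    }
    long cap = Math.min(maxDelayMillis,
        delayMillis * (1L << Math.min(10, failovers - numNameNodes)));
    // randomize to avoid synchronized retry storms from many clients
    return ThreadLocalRandom.current().nextLong(cap + 1);
  }
}
{code}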




[jira] [Commented] (HDFS-14546) Document block placement policies

2019-12-03 Thread Amithsha (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987504#comment-16987504
 ] 

Amithsha commented on HDFS-14546:
-

[~weichiu] Yes, I will remove the PR from the jira. I have updated the git PR 
but got no response, so I am removing it.

> Document block placement policies
> -
>
> Key: HDFS-14546
> URL: https://issues.apache.org/jira/browse/HDFS-14546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Amithsha
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, 
> HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, 
> HDFS-14546-06.patch, HDFS-14546-07.patch, HdfsDesign.patch
>
>
> Currently, all the documentation refers to the default block placement policy.
> However, over time there have been new policies:
> * BlockPlacementPolicyRackFaultTolerant (HDFS-7891)
> * BlockPlacementPolicyWithNodeGroup (HDFS-3601)
> * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006)
> We should update the documentation to refer to them, explaining their 
> particularities and probably how to set up each one of them.






[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-12-03 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987503#comment-16987503
 ] 

Yiqun Lin commented on HDFS-13811:
--

[~LiJinglun], the following change to the unit test can also be rolled back. 
Would you mind rolling it back? The rest looks good to me.

{noformat}
 
+  /**
+   * Test {@link RouterQuotaUpdateService#periodicInvoke()} updates quota usage
+   * in RouterQuotaManager.
+   */
   @Test
   public void testQuotaUpdating() throws Exception {
 long nsQuota = 30;
@@ -498,15 +504,14 @@ public void testQuotaUpdating() throws Exception {
 .spaceQuota(ssQuota).build());
 addMountTable(mountTable);
 
-// Call periodicInvoke to ensure quota  updated in quota manager
-// and state store.
-RouterQuotaUpdateService updateService = routerContext.getRouter()
-.getQuotaCacheUpdateService();
+// Call periodicInvoke to ensure quota  updated in quota manager.
+Router router = routerContext.getRouter();
+RouterQuotaUpdateService updateService =
+router.getQuotaCacheUpdateService();
 updateService.periodicInvoke();
 
 // verify initial quota value
-MountTable updatedMountTable = getMountTable(path);
-RouterQuotaUsage quota = updatedMountTable.getQuota();
+RouterQuotaUsage quota = router.getQuotaManager().getQuotaUsage(path);
 assertEquals(nsQuota, quota.getQuota());
 assertEquals(ssQuota, quota.getSpaceQuota());
 assertEquals(1, quota.getFileAndDirectoryCount());
@@ -520,17 +525,16 @@ public void testQuotaUpdating() throws Exception {
 appendData(path + "/file", routerClient, BLOCK_SIZE);
 
 updateService.periodicInvoke();
-updatedMountTable = getMountTable(path);
-quota = updatedMountTable.getQuota();
+quota = router.getQuotaManager().getQuotaUsage(path);
 
-// verify if quota has been updated in state store
+// verify if quota usage has been updated in RouterQuotaManager.
 assertEquals(nsQuota, quota.getQuota());
 assertEquals(ssQuota, quota.getSpaceQuota());
 assertEquals(3, quota.getFileAndDirectoryCount());
 assertEquals(BLOCK_SIZE, quota.getSpaceConsumed());
 
 // verify quota sync on adding new destination to mount entry.
-updatedMountTable = getMountTable(path);
+MountTable updatedMountTable = getMountTable(path);
{noformat}
 

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, 
> HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch
>
>
> If we try to update the quota of an existing mount entry while the periodic 
> quota update service is running on the same mount entry, the mount table is 
> left in an _inconsistent state_.
> The transactions are:
> A - Quota update service fetches mount table entries.
> B - Quota update service updates the mount table with current usage.
> A' - User updates the quota using the admin cmd.
> With the transaction sequence [ A A' B ], the quota update service updates 
> the mount table with the old quota value.






[jira] [Commented] (HDFS-15005) Backport HDFS-12300 to branch-2

2019-12-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987494#comment-16987494
 ] 

Hadoop QA commented on HDFS-15005:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 
49s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 5s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 9 new + 235 unchanged - 1 fixed = 244 total (was 236) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 51s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:f555aa740b5 |
| JIRA Issue | HDFS-15005 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987413/HDFS-15005-branch-2.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b4127fb9aeac 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 0ac6dc7 |
| maven | version: Apache Maven 3.3.9 |

[jira] [Commented] (HDFS-14825) [Dynamometer] Workload doesn't start unless an absolute path of Mapper class given

2019-12-03 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987462#comment-16987462
 ] 

Hudson commented on HDFS-14825:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17716 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17716/])
HDFS-14825. [Dynamometer] Workload doesn't start unless an absolute path 
(aajisaka: rev 54e760511a2e2f8e5ecf1eb8762434fcd041f4d6)
* (edit) 
hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-workload/src/main/java/org/apache/hadoop/tools/dynamometer/workloadgenerator/WorkloadDriver.java


> [Dynamometer] Workload doesn't start unless an absolute path of Mapper class 
> given
> --
>
> Key: HDFS-14825
> URL: https://issues.apache.org/jira/browse/HDFS-14825
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Soya Miyoshi
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.3.0
>
>
> When starting a workload via start-workload.sh, the workload doesn't start 
> unless an absolute path of the Mapper class is given.
>  
> {code:java}
> $ hadoop/tools/dynamometer/dynamometer-workload/bin/start-workload.sh \
> -Dauditreplay.input-path=hdfs:///user/souya/input/audit \
> -Dauditreplay.output-path=hdfs:///user/souya/results/ \
> -Dauditreplay.num-threads=50 -Dauditreplay.log-start-time.ms=5 \
> -nn_uri hdfs://namenode_address:port/ \
> -mapper_class_name AuditReplayMapper
> {code}
> results in
> {code:java}
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
> details.
> Exception in thread "main" java.lang.ClassNotFoundException: Class 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.AuditReplayMapper not 
> found
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2572)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.getMapperClass(WorkloadDriver.java:183)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.run(WorkloadDriver.java:127)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.main(WorkloadDriver.java:172)
> {code}






[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987459#comment-16987459
 ] 

Xudong Cao commented on HDFS-15027:
---

cc [~weichiu] Sorry, patch uploaded again, this is just a minor log improve, I 
think there's no need for unit test.

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copies a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printed before this jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printed after this jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}
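The change implied by the two examples is roughly the following (hypothetical variable names, not the committed patch):
{code:java}
// Before: the balancer's address is printed as if it were the block's source.
// After (sketch): print the proxy DN the block was actually copied from,
// plus the balancer that initiated the move.
LOG.info("Copied " + block + " from " + proxySourceAddr
    + ", initiated by " + balancerAddr + ", delHint=" + delHint);
{code}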






[jira] [Comment Edited] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987459#comment-16987459
 ] 

Xudong Cao edited comment on HDFS-15027 at 12/4/19 2:40 AM:


cc [~weichiu] Sorry, patch uploaded again. This is just a minor log 
improvement; I think there's no need for a unit test.


was (Author: xudongcao):
cc [~weichiu] Sorry, patch uploaded again, this is just a minor log improve, I 
think there's no need for unit test.

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copies a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printed before this jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printed after this jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}






[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: HDFS-15027.000.patch

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somewhat misleading; maybe we can improve the pattern to something like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:

1. Wrong log printing before jira:
{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
2. Correct log printing after jira:
{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}

  was:
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somewhat misleading; maybe we can improve the pattern to something like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:
 # Wrong log printing before jira:

{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}

 # Correct log printing after jira:

{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somewhat misleading; maybe we can improve the pattern to something like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:
 # Wrong log printing before jira:

{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}

 # Correct log printing after jira:

{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}

  was:
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somewhat misleading; maybe we can improve the pattern to something like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
>  # Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  # Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987452#comment-16987452
 ] 

Hadoop QA commented on HDFS-14998:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
35m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14998 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987417/HDFS-14998.004.patch |
| Optional Tests |  dupname  asflicense  mvnsite  |
| uname | Linux 9bf47831d059 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f1ab7f1 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 309 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28441/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Update Observer Namenode doc for ZKFC after HDFS-14130
> --
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch, HDFS-14998.004.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-14825) [Dynamometer] Workload doesn't start unless an absolute path of Mapper class given

2019-12-03 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-14825.
--
Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged into trunk. Thank you [~tasanuma], [~soyamiyoshi], and [~xkrogen].

> [Dynamometer] Workload doesn't start unless an absolute path of Mapper class 
> given
> --
>
> Key: HDFS-14825
> URL: https://issues.apache.org/jira/browse/HDFS-14825
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Soya Miyoshi
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.3.0
>
>
> When starting a workload by start-workload.sh, unless an absolute path of 
> Mapper is given, the workload doesn't start.
>  
> {code:java}
> $ hadoop/tools/dynamometer/dynamometer-workload/bin/start-workload.sh - \
> Dauditreplay.input-path=hdfs:///user/souya/input/audit  \
> -Dauditreplay.output-path=hdfs:///user/souya/results/ \
> -Dauditreplay.num-threads=50 -Dauditreplay.log-start-time.ms=5 \
> -nn_uri hdfs://namenode_address:port/ \
> -mapper_class_name AuditReplayMapper
> {code}
> results in
> {code:java}
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
> details.
> Exception in thread "main" java.lang.ClassNotFoundException: Class 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.AuditReplayMapper not 
> found
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2572)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.getMapperClass(WorkloadDriver.java:183)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.run(WorkloadDriver.java:127)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.main(WorkloadDriver.java:172)
> {code}
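
One plausible fix is to fall back to the workloadgenerator package when the
bare class name cannot be loaded; a sketch of that idea (illustrative, not
necessarily the merged change in WorkloadDriver):
{code:java}
public class MapperClassResolver {
  private static final String DEFAULT_PACKAGE =
      "org.apache.hadoop.tools.dynamometer.workloadgenerator";

  // Try the name as given first, then retry it relative to the default
  // package, so "-mapper_class_name AuditReplayMapper" also works.
  static Class<?> resolve(String name) throws ClassNotFoundException {
    try {
      return Class.forName(name);
    } catch (ClassNotFoundException e) {
      return Class.forName(DEFAULT_PACKAGE + "." + name);
    }
  }
}
{code}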



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-14980) diskbalancer query command always tries to contact to port 9867

2019-12-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-14980:


Assignee: Aravindan Vijayan  (was: Siddharth Wagle)

> diskbalancer query command always tries to contact to port 9867
> ---
>
> Key: HDFS-14980
> URL: https://issues.apache.org/jira/browse/HDFS-14980
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: diskbalancer
>Reporter: Nilotpal Nandi
>Assignee: Aravindan Vijayan
>Priority: Major
>
> diskbalancer query command always tries to connect to port 9867 even when the 
> datanode IPC port is different.
> In this setup, the datanode IPC port is set to 20001.
>  
> diskbalancer report command works fine and connects to IPC port 20001
>  
> {noformat}
> hdfs diskbalancer -report -node 172.27.131.193
> 19/11/12 08:58:55 INFO command.Command: Processing report command
> 19/11/12 08:58:57 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 19/11/12 08:58:57 INFO block.BlockTokenSecretManager: Setting block keys
> 19/11/12 08:58:57 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 19/11/12 08:58:58 INFO command.Command: Reporting volume information for 
> DataNode(s). These DataNode(s) are parsed from '172.27.131.193'.
> Processing report command
> Reporting volume information for DataNode(s). These DataNode(s) are parsed 
> from '172.27.131.193'.
> [172.27.131.193:20001] - : 3 
> volumes with node data density 0.05.
> [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK1/] - 0.15 used: 
> 39343871181/259692498944, 0.85 free: 220348627763/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK2/] - 0.15 used: 
> 39371179986/259692498944, 0.85 free: 220321318958/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/dn/] - 0.19 used: 
> 49934903670/259692498944, 0.81 free: 209757595274/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
>  
> {noformat}
>  
> But  diskbalancer query command fails and tries to connect to port 9867 
> (default port).
>  
> {noformat}
> hdfs diskbalancer -query 172.27.131.193
> 19/11/12 06:37:15 INFO command.Command: Executing "query plan" command.
> 19/11/12 06:37:16 INFO ipc.Client: Retrying connect to server: 
> /172.27.131.193:9867. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19/11/12 06:37:17 INFO ipc.Client: Retrying connect to server: 
> /172.27.131.193:9867. Already tried 1 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> ..
> ..
> ..
> 19/11/12 06:37:25 ERROR tools.DiskBalancerCLI: Exception thrown while running 
> DiskBalancerCLI.
> {noformat}
>  
>  
> Expectation :
> diskbalancer query command should work fine without explicitly mentioning 
> datanode IPC port address
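
A sketch of how the query path could derive the port the same way report does,
instead of hardcoding 9867; the helper below is illustrative, and only the
dfs.datanode.ipc.address key is real:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class QueryPortSketch {
  // Use an explicit host:port if the user gave one; otherwise fall back to
  // the configured DN IPC address rather than a hardcoded default port.
  static int ipcPort(String nodeArg, Configuration conf) {
    int colon = nodeArg.lastIndexOf(':');
    if (colon > 0) {
      return Integer.parseInt(nodeArg.substring(colon + 1));
    }
    String addr = conf.get("dfs.datanode.ipc.address", "0.0.0.0:9867");
    return Integer.parseInt(addr.substring(addr.lastIndexOf(':') + 1));
  }
}
{code}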



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-14980) diskbalancer query command always tries to contact to port 9867

2019-12-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-14980:


Assignee: Siddharth Wagle  (was: Aravindan Vijayan)

> diskbalancer query command always tries to contact to port 9867
> ---
>
> Key: HDFS-14980
> URL: https://issues.apache.org/jira/browse/HDFS-14980
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: diskbalancer
>Reporter: Nilotpal Nandi
>Assignee: Siddharth Wagle
>Priority: Major
>
> diskbalancer query command always tries to connect to port 9867 even when the 
> datanode IPC port is different.
> In this setup, the datanode IPC port is set to 20001.
>  
> diskbalancer report command works fine and connects to IPC port 20001
>  
> {noformat}
> hdfs diskbalancer -report -node 172.27.131.193
> 19/11/12 08:58:55 INFO command.Command: Processing report command
> 19/11/12 08:58:57 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 19/11/12 08:58:57 INFO block.BlockTokenSecretManager: Setting block keys
> 19/11/12 08:58:57 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 19/11/12 08:58:58 INFO command.Command: Reporting volume information for 
> DataNode(s). These DataNode(s) are parsed from '172.27.131.193'.
> Processing report command
> Reporting volume information for DataNode(s). These DataNode(s) are parsed 
> from '172.27.131.193'.
> [172.27.131.193:20001] - : 3 
> volumes with node data density 0.05.
> [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK1/] - 0.15 used: 
> 39343871181/259692498944, 0.85 free: 220348627763/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK2/] - 0.15 used: 
> 39371179986/259692498944, 0.85 free: 220321318958/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/dn/] - 0.19 used: 
> 49934903670/259692498944, 0.81 free: 209757595274/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
>  
> {noformat}
>  
> But  diskbalancer query command fails and tries to connect to port 9867 
> (default port).
>  
> {noformat}
> hdfs diskbalancer -query 172.27.131.193
> 19/11/12 06:37:15 INFO command.Command: Executing "query plan" command.
> 19/11/12 06:37:16 INFO ipc.Client: Retrying connect to server: 
> /172.27.131.193:9867. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19/11/12 06:37:17 INFO ipc.Client: Retrying connect to server: 
> /172.27.131.193:9867. Already tried 1 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> ..
> ..
> ..
> 19/11/12 06:37:25 ERROR tools.DiskBalancerCLI: Exception thrown while running 
> DiskBalancerCLI.
> {noformat}
>  
>  
> Expectation :
> diskbalancer query command should work fine without explicitly mentioning 
> datanode IPC port address



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted

2019-12-03 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15031:
---
Status: Patch Available  (was: Open)

> Allow BootstrapStandby to download FSImage if the directory is already 
> formatted
> 
>
> Key: HDFS-15031
> URL: https://issues.apache.org/jira/browse/HDFS-15031
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Minor
> Attachments: HDFS-15031.000.patch
>
>
> Currently, BootstrapStandby will only download the latest FSImage if it has 
> formatted the local image directory. This can be an issue when there are out 
> of date FSImages on a Standby NameNode, as the non-interactive mode will not 
> format the image directory, and BootstrapStandby will return an error code. 
> The changes here simply allow BootstrapStandby to download the latest FSImage 
> to the image directory, without needing to format first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted

2019-12-03 Thread Danny Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Becker updated HDFS-15031:

Attachment: HDFS-15031.000.patch

> Allow BootstrapStandby to download FSImage if the directory is already 
> formatted
> 
>
> Key: HDFS-15031
> URL: https://issues.apache.org/jira/browse/HDFS-15031
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Minor
> Attachments: HDFS-15031.000.patch
>
>
> Currently, BootstrapStandby will only download the latest FSImage if it has 
> formatted the local image directory. This can be an issue when there are out 
> of date FSImages on a Standby NameNode, as the non-interactive mode will not 
> format the image directory, and BootstrapStandby will return an error code. 
> The changes here simply allow BootstrapStandby to download the latest FSImage 
> to the image directory, without needing to format first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted

2019-12-03 Thread Danny Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Becker reassigned HDFS-15031:
---

Assignee: Danny Becker

> Allow BootstrapStandby to download FSImage if the directory is already 
> formatted
> 
>
> Key: HDFS-15031
> URL: https://issues.apache.org/jira/browse/HDFS-15031
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Minor
>
> Currently, BootstrapStandby will only download the latest FSImage if it has 
> formatted the local image directory. This can be an issue when there are out 
> of date FSImages on a Standby NameNode, as the non-interactive mode will not 
> format the image directory, and BootstrapStandby will return an error code. 
> The changes here simply allow BootstrapStandby to download the latest FSImage 
> to the image directory, without needing to format first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted

2019-12-03 Thread Danny Becker (Jira)
Danny Becker created HDFS-15031:
---

 Summary: Allow BootstrapStandby to download FSImage if the 
directory is already formatted
 Key: HDFS-15031
 URL: https://issues.apache.org/jira/browse/HDFS-15031
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs, namenode
Reporter: Danny Becker


Currently, BootstrapStandby will only download the latest FSImage if it has 
formatted the local image directory. This can be an issue when there are out of 
date FSImages on a Standby NameNode, as the non-interactive mode will not 
format the image directory, and BootstrapStandby will return an error code. The 
changes here simply allow BootstrapStandby to download the latest FSImage to 
the image directory, without needing to format first.
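
A minimal sketch of the described behavior change; the interfaces and names
are invented for illustration, while the real patch works on BootstrapStandby
and NNStorage internals:
{code:java}
public class BootstrapSketch {
  interface Storage { boolean isFormatted(); void format(); }
  interface ImageFetcher { void downloadLatestImage(); }

  static int bootstrap(Storage storage, ImageFetcher fetcher) {
    if (!storage.isFormatted()) {
      storage.format();  // fresh directory: format as before
    }
    // Previously an already-formatted directory caused an error return in
    // non-interactive mode; now we fall through and fetch the latest image.
    fetcher.downloadLatestImage();
    return 0;
  }
}
{code}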



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-03 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987416#comment-16987416
 ] 

Fei Hui commented on HDFS-14998:


[~ayushtkn][~csun][~shv] Thanks for your comments.
Uploaded the v005 patch.
After HDFS-14130, we state that users can run ZKFC on the observer namenode, and 
ZKFC will not participate in the election of Active until the namenode is 
transitioned to the standby state.

> Update Observer Namenode doc for ZKFC after HDFS-14130
> --
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch, HDFS-14998.004.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-03 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-14998:
---
Attachment: HDFS-14998.004.patch

> Update Observer Namenode doc for ZKFC after HDFS-14130
> --
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch, HDFS-14998.004.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-03 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987411#comment-16987411
 ] 

Xudong Cao commented on HDFS-14963:
---

cc [~weichiu] All review comments have been resolved; could someone please 
merge this patch? Then we can proceed with HDFS-14969 (as they modify some of 
the same files).

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls at 
> the 1st namenode, simply polls, and finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN. 
> When a client starts rpc with the 1st NN, it is silent during the failover 
> from the 1st NN to the 2nd NN, but the failover from the 2nd NN to the 3rd NN 
> prints some unnecessary logs, and in some scenarios these logs are very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on each client machine, for every 
> hdfs cluster, cache its current Active NameNode index in a separate cache 
> file named by its uri. *Note these cache files are shared by all hdfs client 
> processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  #  When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an rpc call to the right ANN.
>  #  After each failover, the client needs to write the latest Active 
> NameNode index to the corresponding cache file based on the target hdfs uri.
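
A minimal sketch of the proposed cache file, assuming a plain-text integer per
nameservice; the directory and format here are assumptions drawn from the
description, not a committed design:
{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ActiveNnIndexCache {
  private final Path cacheFile;

  ActiveNnIndexCache(String cacheDir, String nameservice) {
    this.cacheFile = Paths.get(cacheDir, nameservice);  // e.g. /tmp/ns1
  }

  // Read the cached index at client start; default to 0 (today's behavior)
  // when no cache exists yet or it is unreadable.
  int readIndex() {
    try {
      return Integer.parseInt(new String(
          Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim());
    } catch (IOException | NumberFormatException e) {
      return 0;
    }
  }

  // Write the new index after a failover so later clients start at the ANN.
  void writeIndex(int activeIndex) throws IOException {
    Files.write(cacheFile,
        Integer.toString(activeIndex).getBytes(StandardCharsets.UTF_8));
  }
}
{code}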



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15005) Backport HDFS-12300 to branch-2

2019-12-03 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-15005:

Attachment: HDFS-15005-branch-2.002.patch

> Backport HDFS-12300 to branch-2
> ---
>
> Key: HDFS-15005
> URL: https://issues.apache.org/jira/browse/HDFS-15005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-15005-branch-2.000.patch, 
> HDFS-15005-branch-2.001.patch, HDFS-15005-branch-2.002.patch
>
>
> Having DT related information is very useful in audit log. This tracks effort 
> to backport HDFS-12300 to branch-2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15005) Backport HDFS-12300 to branch-2

2019-12-03 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987346#comment-16987346
 ] 

Chao Sun commented on HDFS-15005:
-

Thanks [~weichiu], and sorry I missed the many unused imports. One question 
though: for the style issues, do you think we should fix things other than the 
unused imports? I'm wondering whether we should keep it consistent with the 
original patch.

> Backport HDFS-12300 to branch-2
> ---
>
> Key: HDFS-15005
> URL: https://issues.apache.org/jira/browse/HDFS-15005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-15005-branch-2.000.patch, 
> HDFS-15005-branch-2.001.patch
>
>
> Having DT related information is very useful in audit log. This tracks effort 
> to backport HDFS-12300 to branch-2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-03 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987338#comment-16987338
 ] 

Konstantin Shvachko commented on HDFS-14998:


I see you guys are all on the same page here; we just need to formulate it in 
the document.
# We do not require turning off ZKFC on an Observer Node anymore. So we should 
just remove any mention of it in the doc.
# ZKFC running on an Observer Node does not participate in the election of 
Active; only Standby Nodes do. We should clarify this in the doc.

> Update Observer Namenode doc for ZKFC after HDFS-14130
> --
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15010) BlockPoolSlice#addReplicaThreadPool static pool should be initialized by static method

2019-12-03 Thread Yuval Degani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987305#comment-16987305
 ] 

Yuval Degani commented on HDFS-15010:
-

[~surendrasingh], would you consider backporting this patch to branch-2?

> BlockPoolSlice#addReplicaThreadPool static pool should be initialized by 
> static method
> --
>
> Key: HDFS-15010
> URL: https://issues.apache.org/jira/browse/HDFS-15010
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.1.2
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15010.001.patch, HDFS-15010.02.patch, 
> HDFS-15010.03.patch, HDFS-15010.04.patch, HDFS-15010.05.patch
>
>
> The {{BlockPoolSlice#initializeAddReplicaPool()}} method currently initializes 
> the static thread pool instance. But when two {{BPServiceActor}} actors try to 
> load block pools in parallel, they may create different instances. 
> So {{BlockPoolSlice#initializeAddReplicaPool()}} should be a static 
> method.
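
The gist of the fix in a few lines (simplified; the real method also takes the
dataset and configuration into account):
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AddReplicaPoolSketch {
  private static ExecutorService addReplicaThreadPool;

  // static + synchronized: even if two BPServiceActors race into block pool
  // loading at the same time, only one shared pool is ever created.
  static synchronized void initializeAddReplicaPool(int threads) {
    if (addReplicaThreadPool == null) {
      addReplicaThreadPool = Executors.newFixedThreadPool(threads);
    }
  }
}
{code}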



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries

2019-12-03 Thread Yuval Degani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987303#comment-16987303
 ] 

Yuval Degani commented on HDFS-15009:
-

[~ayushtkn], [~hemanthboyina], would you mind backporting this patch to 
branch-2? seems highly relevant

> FSCK "-list-corruptfileblocks" return Invalid Entries
> -
>
> Key: HDFS-15009
> URL: https://issues.apache.org/jira/browse/HDFS-15009
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-15009.001.patch, HDFS-15009.002.patch, 
> HDFS-15009.003.patch, HDFS-15009.004.patch
>
>
> Scenario: if we have two directories, dir1 and dir10, and only dir10 has 
> corrupt files. 
> Now if we run -list-corruptfileblocks for dir1, the corrupt file count shown 
> for dir1 is actually that of dir10:
> {code:java}
>   while (blkIterator.hasNext()) {
> BlockInfo blk = blkIterator.next();
> final INodeFile inode = getBlockCollection(blk);
> skip++;
> if (inode != null) {
>   String src = inode.getFullPathName();
>   if (src.startsWith(path)){
> corruptFiles.add(new CorruptFileBlockInfo(src, blk));
> count++;
> if (count >= DEFAULT_MAX_CORRUPT_FILEBLOCKS_RETURNED)
>   break;
>   }
> }
>   } {code}
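
The startsWith check above treats "/dir10" as a match for the path "/dir1". A
sketch of the ancestor test that avoids this (illustrative, not the committed
patch):
{code:java}
public class PathMatchSketch {
  // True only when 'path' equals 'src' or is a genuine ancestor directory:
  // the character right after the prefix must be a path separator.
  static boolean isAncestorOrSelf(String path, String src) {
    if (!src.startsWith(path)) {
      return false;
    }
    return src.length() == path.length()
        || path.endsWith("/")
        || src.charAt(path.length()) == '/';
  }

  public static void main(String[] args) {
    System.out.println(isAncestorOrSelf("/dir1", "/dir10/file"));  // false
    System.out.println(isAncestorOrSelf("/dir1", "/dir1/file"));   // true
  }
}
{code}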



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14519) NameQuota is not update after concat operation, so namequota is wrong

2019-12-03 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987291#comment-16987291
 ] 

Erik Krogen commented on HDFS-14519:


[~RANith] or [~ayushtkn], are you interested in putting together a branch-2 
backport? It seems that the test case needs to be moved.

> NameQuota is not update after concat operation, so namequota is wrong
> -
>
> Key: HDFS-14519
> URL: https://issues.apache.org/jira/browse/HDFS-14519
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ranith Sardar
>Assignee: Ranith Sardar
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14519.001.patch, HDFS-14519.002.patch, 
> HDFS-14519.003.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987284#comment-16987284
 ] 

Wei-Chiu Chuang commented on HDFS-15027:


Was the patch file deleted from the attachments?

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15030) Add setQuota operation to HttpFS

2019-12-03 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987280#comment-16987280
 ] 

Wei-Chiu Chuang commented on HDFS-15030:


I thought it was added by HDFS-8631? Looks like the HttpFS part was removed in 
the final patch.

> Add setQuota operation to HttpFS
> 
>
> Key: HDFS-15030
> URL: https://issues.apache.org/jira/browse/HDFS-15030
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
>
> setQuota operation is missing in HttpFS



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15022) Add new RPC to transfer data block with external shell script across Datanode

2019-12-03 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987276#comment-16987276
 ] 

Wei-Chiu Chuang commented on HDFS-15022:


This looks like a new feature to me. Can we add documentation about this 
feature, how it is used and so on?

> Add new RPC to transfer data block with external shell script across Datanode
> -
>
> Key: HDFS-15022
> URL: https://issues.apache.org/jira/browse/HDFS-15022
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15022.patch, HDFS-15022.patch
>
>
> Replicating a data block is expensive when some Datanodes are down, especially 
> for slow storage. Add a new RPC to replicate a block with an external shell 
> script across Datanodes, so users can choose a more effective way to copy 
> block files.
> In our setup, Archive volumes are configured on remote reliable storage; we 
> just add a link file on the new datanode pointing to the remote file when 
> doing replication.
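
Since documentation was requested, a purely hypothetical sketch of what
invoking such a script could look like; the script path and arguments are
invented for illustration and are not from the posted patch:
{code:java}
import java.io.IOException;

public class ExternalCopySketch {
  // Run a user-supplied script with the block file and target DN as
  // arguments, and fail the transfer if the script exits non-zero.
  static void copyBlock(String script, String blockPath, String targetDn)
      throws IOException, InterruptedException {
    Process p = new ProcessBuilder(script, blockPath, targetDn)
        .inheritIO()
        .start();
    if (p.waitFor() != 0) {
      throw new IOException("external copy script failed for " + blockPath);
    }
  }
}
{code}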



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15005) Backport HDFS-12300 to branch-2

2019-12-03 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987274#comment-16987274
 ] 

Wei-Chiu Chuang commented on HDFS-15005:


Cool. For my future reference: 
https://dzone.com/articles/string-concatenation-performacne-improvement-in-ja
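
As a classic illustration of why the concatenation strategy matters (a toy
example, not taken from the linked article):
{code:java}
public class ConcatSketch {
  static String slow(String[] parts) {
    String s = "";
    for (String p : parts) {
      s += p;  // each iteration allocates a new builder and recopies s
    }
    return s;
  }

  static String fast(String[] parts) {
    StringBuilder sb = new StringBuilder();
    for (String p : parts) {
      sb.append(p);  // amortized O(1) appends into one buffer
    }
    return sb.toString();
  }
}
{code}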

Can you take care of the checkstyle warnings? Other than that I am +1

> Backport HDFS-12300 to branch-2
> ---
>
> Key: HDFS-15005
> URL: https://issues.apache.org/jira/browse/HDFS-15005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-15005-branch-2.000.patch, 
> HDFS-15005-branch-2.001.patch
>
>
> Having DT related information is very useful in audit log. This tracks effort 
> to backport HDFS-12300 to branch-2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14546) Document block placement policies

2019-12-03 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987251#comment-16987251
 ] 

Wei-Chiu Chuang commented on HDFS-14546:


[~Amithsha] sounds like you moved to the patch files here in the JIRA? Could 
you abandon the github PR if that's stale?

> Document block placement policies
> -
>
> Key: HDFS-14546
> URL: https://issues.apache.org/jira/browse/HDFS-14546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Amithsha
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, 
> HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, 
> HDFS-14546-06.patch, HDFS-14546-07.patch, HdfsDesign.patch
>
>
> Currently, all the documentation refers to the default block placement policy.
> However, over time there have been new policies:
> * BlockPlacementPolicyRackFaultTolerant (HDFS-7891)
> * BlockPlacementPolicyWithNodeGroup (HDFS-3601)
> * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006)
> We should update the documentation to refer to them explaining their 
> particularities and probably how to setup each one of them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary

2019-12-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987235#comment-16987235
 ] 

Hadoop QA commented on HDFS-14957:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
3s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 149 unchanged - 1 fixed = 149 total (was 150) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 25s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}100m 
56s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14957 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987386/HDFS-14957.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 403e9a680e72 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0c217fe |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28439/testReport/ |
| Max. process+thread count | 2732 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28439/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
> ---
>
> Key: HDFS-14957
> URL: https://issues.apache.org/jira/browse/HDFS-14957
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.4
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14957.001.patch, HDFS-14957.JPG
>
>
> for INodeReferences, space consumed was different in QuotaUsage and Content 
> Summary 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987202#comment-16987202
 ] 

Ayush Saxena commented on HDFS-14998:
-

[~ferhui] you don't need to copy-paste the words exactly; if you have understood 
what [~csun] meant to say, you can write in your own words what you feel shall 
convey the meaning to all.
Let me clear it up once more. Earlier, ZKFC was not allowed to run on the node 
running the Observer Namenode: since ZKFC wasn't aware of the ONN, it used to 
transition it to Standby, so it was mandatory to turn ZKFC off on the Observer 
Namenode if you wanted to use the feature.
In the present scenario, ZKFC is aware of the Observer Namenode and doesn't 
bother it, so you can have ZKFC running on the Observer Node; it is no longer 
compulsory to turn it off. Nor is it required to turn it on for the normal 
working of the Observer, since the Observer cannot participate in failover. But 
if a user has a use case where he keeps changing the state of the Observer to 
Standby back and forth depending on load or some condition (either manually or 
through scripts), he can keep ZKFC running, so that when the Namenode is moved 
to the Standby state it can participate in Automatic Failover. With ZKFC being 
aware of the ONN, this sheds the need to turn ZKFC on and off every time the 
namenode transitions states.

We are just being suggestive, not forcing any user to choose one way; whatever 
his use case may be, he can go ahead. What Chao mentioned is one use case where 
ZKFC can be kept running on the ONN. If someone doesn't have such a use case, 
he is free to turn ZKFC off.

You can frame it in your own words; whatever way you think conveys it to the 
user correctly in fewer words shall be fine for both of us.
[~csun] correct me if I have gone wrong on the concept somewhere.
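
For example, such a toggle could be done with the standard haadmin transitions 
(nn3 is a placeholder NameNode ID):
{noformat}
hdfs haadmin -transitionToStandby nn3    # Observer -> Standby, eligible for failover
hdfs haadmin -transitionToObserver nn3   # back to Observer once load allows
{noformat}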

> Update Observer Namenode doc for ZKFC after HDFS-14130
> --
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15016) RBF: getDatanodeReport() should return the latest update

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987196#comment-16987196
 ] 

Ayush Saxena commented on HDFS-15016:
-

No preference as such; I am fine as long as it covers our change.

> RBF: getDatanodeReport() should return the latest update
> 
>
> Key: HDFS-15016
> URL: https://issues.apache.org/jira/browse/HDFS-15016
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15016.000.patch
>
>
> Currently, when the Router calls getDatanodeReport() (or 
> getDatanodeStorageReport()) and the DN is in multiple clusters, it just takes 
> the one that comes first. It should consider the latest update.
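
A sketch of the intended merge rule, with simplified stand-ins for
DatanodeInfo (illustrative only, not the attached patch):
{code:java}
import java.util.HashMap;
import java.util.Map;

public class LatestReportSketch {
  static class Report {
    String dnId;
    long lastUpdate;
    Report(String dnId, long lastUpdate) {
      this.dnId = dnId;
      this.lastUpdate = lastUpdate;
    }
  }

  // When a DN shows up in several subclusters, keep the report with the
  // newest lastUpdate instead of whichever arrived first.
  static Map<String, Report> merge(Iterable<Report> all) {
    Map<String, Report> byDn = new HashMap<>();
    for (Report r : all) {
      byDn.merge(r.dnId, r,
          (a, b) -> a.lastUpdate >= b.lastUpdate ? a : b);
    }
    return byDn;
  }
}
{code}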



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-12-03 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987194#comment-16987194
 ] 

Íñigo Goiri commented on HDFS-14908:


Thanks for rebasing and checking the issue.
HDFS-15009 seems to be the one doing the change.
[~hemanthboyina] do you mind taking a look?

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, 
> HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, 
> HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.TestV4.patch, 
> Test.java, TestV2.java, TestV3.java
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter 
> path is the prefix of the open files. We should check whether the filter path 
> is the parent/ancestor of the open files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987189#comment-16987189
 ] 

Ayush Saxena commented on HDFS-14957:
-

bq. for calculating space consumed, we are considering the file which exists in 
the diff (in snapshot) and also the file's getBlocks()

Can you link the JIRA as part of which that was done? Do you find any reason 
for doing so? The people involved there might have had some reasons; just to be 
sure, we can pull them in here.

Anyway, ContentSummary and QuotaUsage are different APIs and can have different 
behaviors, provided that is justified.

> INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
> ---
>
> Key: HDFS-14957
> URL: https://issues.apache.org/jira/browse/HDFS-14957
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.4
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14957.001.patch, HDFS-14957.JPG
>
>
> for INodeReferences, space consumed was different in QuotaUsage and Content 
> Summary 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15016) RBF: getDatanodeReport() should return the latest update

2019-12-03 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987188#comment-16987188
 ] 

Íñigo Goiri commented on HDFS-15016:


Thanks [~ayushtkn], let me fix the logging.
Any preference for the test?

> RBF: getDatanodeReport() should return the latest update
> 
>
> Key: HDFS-15016
> URL: https://issues.apache.org/jira/browse/HDFS-15016
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15016.000.patch
>
>
> Currently, when the Router calls getDatanodeReport() (or 
> getDatanodeStorageReport()) and the DN is in multiple clusters, it just takes 
> the one that comes first. It should consider the latest update.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987183#comment-16987183
 ] 

Ayush Saxena commented on HDFS-15012:
-

HDFS-13101 fixes FSImage corruption, and that too seems a critical bug fix. Will reverting it not make us vulnerable to that again?
Moreover, it has already been released as part of lower versions.
Reverting would solve this problem but would reopen the older one.
IMO, reverting isn't going to give any big relief. Anyway, there is no release planned soon, so you can all take some more time to get to a solution.
If you have any more pointers on how to reproduce the problem, let us know; we can try to help too.

> NN fails to parse Edit logs after applying HDFS-13101
> -
>
> Key: HDFS-15012
> URL: https://issues.apache.org/jira/browse/HDFS-15012
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Eric Lin
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: release-blocker
>
> After applying HDFS-13101, and deleting and creating large number of 
> snapshots, SNN exited with below error:
>   
> {code:sh}
> 2019-11-18 08:28:06,528 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, 
> snapshotName=distcp-3479-31-old, 
> RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, Rpc
> CallId=1]
> java.lang.AssertionError: Element already exists: 
> element=partition_isactive=true, DELETED=[partition_isactive=true]
> at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193)
> at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239)
> at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> 

[jira] [Commented] (HDFS-15016) RBF: getDatanodeReport() should return the latest update

2019-12-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987172#comment-16987172
 ] 

Ayush Saxena commented on HDFS-15016:
-

Thanx [~elgoiri] for the patch. Fix LGTM.
Just FYI, this log will not be printed if the DN reported second has the greater last update:

{code:java}
LOG.debug("{} is in multiple subclusters", nodeId);
{code}

even though the DN would still have been reported from multiple clusters. If you don't have any issue with that, I am also fine.
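
For context, a minimal sketch of the merge this fix implies (illustrative names, not the actual Router code): keep, per datanode, the report with the greatest last-update time, and log the duplicate whichever order the reports arrive in.

{code:java}
// Sketch only: dedupe DatanodeInfo reports by transfer address.
Map<String, DatanodeInfo> latest = new HashMap<>();
for (DatanodeInfo node : reports) {
  String nodeId = node.getXferAddr();
  DatanodeInfo prev = latest.get(nodeId);
  if (prev != null) {
    // Logged for every duplicate, regardless of arrival order.
    LOG.debug("{} is in multiple subclusters", nodeId);
    if (node.getLastUpdate() <= prev.getLastUpdate()) {
      continue;  // the entry we already hold is the latest
    }
  }
  latest.put(nodeId, node);
}
{code}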
 

> RBF: getDatanodeReport() should return the latest update
> 
>
> Key: HDFS-15016
> URL: https://issues.apache.org/jira/browse/HDFS-15016
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15016.000.patch
>
>
> Currently, when the Router calls getDatanodeReport() (or 
> getDatanodeStorageReport()) and the DN is in multiple clusters, it just takes 
> the one that comes first. It should consider the latest update.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101

2019-12-03 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987164#comment-16987164
 ] 

Wei-Chiu Chuang commented on HDFS-15012:


We are actively working on this. However, given the nature of the problem, it is 
not easy to reproduce in a unit test.
To be on the safe side, I would suggest reverting HDFS-13101 for now.

Bumping the priority to Blocker and adding the release-blocker label to this JIRA.

> NN fails to parse Edit logs after applying HDFS-13101
> -
>
> Key: HDFS-15012
> URL: https://issues.apache.org/jira/browse/HDFS-15012
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Eric Lin
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: release-blocker
>
> After applying HDFS-13101, and deleting and creating large number of 
> snapshots, SNN exited with below error:
>   
> {code:sh}
> 2019-11-18 08:28:06,528 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, 
> snapshotName=distcp-3479-31-old, 
> RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, Rpc
> CallId=1]
> java.lang.AssertionError: Element already exists: 
> element=partition_isactive=true, DELETED=[partition_isactive=true]
> at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193)
> at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239)
> at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {code}
> We confirmed that fsimage and edit files were NOT corrupted, as reverting 
> HDFS-13101 fixed the issue. So the logic introduced in HDFS-13101 is broken 
> and failed to parse edit log files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101

2019-12-03 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15012:
---
Target Version/s: 3.0.4, 3.3.0, 2.8.6, 2.9.3, 3.1.4, 3.2.2, 2.10.1
  Labels: release-blocker  (was: )

> NN fails to parse Edit logs after applying HDFS-13101
> -
>
> Key: HDFS-15012
> URL: https://issues.apache.org/jira/browse/HDFS-15012
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Eric Lin
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: release-blocker
>
> After applying HDFS-13101, and deleting and creating large number of 
> snapshots, SNN exited with below error:
>   
> {code:sh}
> 2019-11-18 08:28:06,528 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, 
> snapshotName=distcp-3479-31-old, 
> RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, Rpc
> CallId=1]
> java.lang.AssertionError: Element already exists: 
> element=partition_isactive=true, DELETED=[partition_isactive=true]
> at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193)
> at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239)
> at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {code}
> We confirmed that fsimage and edit files were NOT corrupted, as reverting 
> HDFS-13101 fixed the issue. So the logic introduced in HDFS-13101 is broken 
> and failed to parse edit log files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101

2019-12-03 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15012:
---
Priority: Blocker  (was: Critical)

> NN fails to parse Edit logs after applying HDFS-13101
> -
>
> Key: HDFS-15012
> URL: https://issues.apache.org/jira/browse/HDFS-15012
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Eric Lin
>Assignee: Shashikant Banerjee
>Priority: Blocker
>
> After applying HDFS-13101, and deleting and creating large number of 
> snapshots, SNN exited with below error:
>   
> {code:sh}
> 2019-11-18 08:28:06,528 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, 
> snapshotName=distcp-3479-31-old, 
> RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, Rpc
> CallId=1]
> java.lang.AssertionError: Element already exists: 
> element=partition_isactive=true, DELETED=[partition_isactive=true]
> at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193)
> at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239)
> at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {code}
> We confirmed that fsimage and edit files were NOT corrupted, as reverting 
> HDFS-13101 fixed the issue. So the logic introduced in HDFS-13101 is broken 
> and failed to parse edit log files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary

2019-12-03 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987144#comment-16987144
 ] 

Íñigo Goiri commented on HDFS-14957:


* Can we minimize the diff? Currently, lines 889 to 893 are the same as before 
but show as different because of the format; let's avoid that, or if there are 
changes, isolate them.
* We should add a javadoc to getDiskSpaceQuota() explaining the high-level 
behavior and probably the issue raised by this JIRA.
* Similarly for the test, it would be good to explain what is different from 
before.

> INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
> ---
>
> Key: HDFS-14957
> URL: https://issues.apache.org/jira/browse/HDFS-14957
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.4
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14957.001.patch, HDFS-14957.JPG
>
>
> for INodeReferences , space consumed was different in QuotaUsage and Content 
> Summary 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14901) RBF: Add Encryption Zone related ClientProtocol APIs

2019-12-03 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987140#comment-16987140
 ] 

Íñigo Goiri commented on HDFS-14901:


The test is fairly fast:
https://builds.apache.org/job/PreCommit-HDFS-Build/28435/testReport/org.apache.hadoop.hdfs.server.federation.router/TestRouterEncryptionZone/
I would prefer to use Before and After instead of BeforeClass, as it is safer.
In the AfterClass (or After now), we may want to set the cluster to null, and 
null-check it before shutting it down.
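
A minimal sketch of the suggested pattern (the field and the buildCluster() helper are placeholders, not the actual test code):

{code:java}
private MiniRouterDFSCluster cluster;

@Before
public void setup() throws Exception {
  cluster = buildCluster();  // placeholder for the real setup
}

@After
public void teardown() {
  // Null-check so a failure in setup() does not mask the real error,
  // and reset to null so no state leaks into the next test.
  if (cluster != null) {
    cluster.shutdown();
    cluster = null;
  }
}
{code}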

> RBF: Add Encryption Zone related ClientProtocol APIs
> 
>
> Key: HDFS-14901
> URL: https://issues.apache.org/jira/browse/HDFS-14901
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14901.001.patch, HDFS-14901.002.patch, 
> HDFS-14901.003.patch
>
>
> Currently listEncryptionZones,reencryptEncryptionZone,listReencryptionStatus 
> these APIs are not implemented in Router.
> This JIRA is intend to implement above mentioned APIs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-03 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987138#comment-16987138
 ] 

Chao Sun commented on HDFS-14998:
-

[~ferhui], without ZKFC running, it is not quite useful to just run the DFS admin 
command to transition the observer to SBN, since it won't be eligible for 
failover unless you also launch ZKFC on that host.

{quote}
The only benefit for running ZKFC on Observer NameNode is that it would join 
election when the Observer NameNode is transitioned to Standby state.
{quote}
Yes this is correct.

{quote}
From your words, dynamic transition (between observer and standby) can be done 
only when ZKFC is running on it. But I think dynamic transition can be done 
without ZKFC running. Please correct me, thanks.
{quote}
It can be done without ZKFC running, but with extra steps such as launching 
ZKFC / shutting down ZKFC etc. 

> Update Observer Namenode doc for ZKFC after HDFS-14130
> --
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15030) Add setQuota operation to HttpFS

2019-12-03 Thread hemanthboyina (Jira)
hemanthboyina created HDFS-15030:


 Summary: Add setQuota operation to HttpFS
 Key: HDFS-15030
 URL: https://issues.apache.org/jira/browse/HDFS-15030
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: hemanthboyina
Assignee: hemanthboyina


setQuota operation is missing in HttpFS



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary

2019-12-03 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987095#comment-16987095
 ] 

hemanthboyina commented on HDFS-14957:
--

Attached patch, please review.

> INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
> ---
>
> Key: HDFS-14957
> URL: https://issues.apache.org/jira/browse/HDFS-14957
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.4
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14957.001.patch, HDFS-14957.JPG
>
>
> for INodeReferences , space consumed was different in QuotaUsage and Content 
> Summary 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary

2019-12-03 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14957:
-
Attachment: HDFS-14957.001.patch
Status: Patch Available  (was: Open)

> INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
> ---
>
> Key: HDFS-14957
> URL: https://issues.apache.org/jira/browse/HDFS-14957
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.4
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14957.001.patch, HDFS-14957.JPG
>
>
> for INodeReferences , space consumed was different in QuotaUsage and Content 
> Summary 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-12-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986920#comment-16986920
 ] 

Hadoop QA commented on HDFS-13811:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m  
6s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-13811 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987366/HDFS-13811.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 82775d96bdf0 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0c217fe |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28438/testReport/ |
| Max. process+thread count | 3134 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28438/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: Race condition between router admin quota update and periodic quota 
> update service
> 

[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-12-03 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986891#comment-16986891
 ] 

huhaiyang edited comment on HDFS-15024 at 12/3/19 1:26 PM:
---

Currently implemented in RetryPolicies:
{code:java}
/**
 * @return 0 if this is our first failover/retry (i.e., retry immediately),
 * sleep exponentially otherwise
 */
private long getFailoverOrRetrySleepTime(int times) {
  return times == 0 ? 0 :
      calculateExponentialTime(delayMillis, times, maxDelayBase);
}
{code}

It is reasonable to consider the number of NameNodes as a condition for 
calculating the sleep duration.



was (Author: haiyang hu):
Current In RetryPolicies Implemented

 
{code:java}
   /**
 * @return 0 if this is our first failover/retry (i.e., retry immediately),
 * sleep exponentially otherwise
 */
private long getFailoverOrRetrySleepTime(int times) {
  return times == 0 ? 0 : 
calculateExponentialTime(delayMillis, times, maxDelayBase);
}
{code}

   It is reasonable to consider the number of namenode as a condition for 
calculating sleep duration


> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -mkdir /user/haiyang1/test8
> You need to request nn1 when you execute the msync method,
> Actually connect nn2 first and failover is required
> In connection nn3 does not meet the requirements, failover needs to be 
> performed, but at this time, failover operation needs to be performed during 
> a period of hibernation
> Finally, it took a period of hibernation to connect the successful request to 
> nn1
> In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current 
> default implementation is Sleep time is calculated when more than one 
> failover operation is performed
> I think that the Number of NameNodes as a condition of calculation of sleep 
> time is more reasonable
> That is, in the current test, executing failover on connection nn3 does not 
> need to sleep time to directly connect to the next nn node
> See client_error.log for details



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-12-03 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986891#comment-16986891
 ] 

huhaiyang edited comment on HDFS-15024 at 12/3/19 1:26 PM:
---

Currently implemented in RetryPolicies:

{code:java}
/**
 * @return 0 if this is our first failover/retry (i.e., retry immediately),
 * sleep exponentially otherwise
 */
private long getFailoverOrRetrySleepTime(int times) {
  return times == 0 ? 0 :
      calculateExponentialTime(delayMillis, times, maxDelayBase);
}
{code}

It is reasonable to consider the number of NameNodes as a condition for 
calculating the sleep duration.



was (Author: haiyang hu):
In RetryPolicies Implemented

/**
 * @return 0 if this is our first failover/retry (i.e., retry immediately),
 * sleep exponentially otherwise
 */
private long getFailoverOrRetrySleepTime(int times) {
  return times == 0 ? 0 : 
calculateExponentialTime(delayMillis, times, maxDelayBase);
}

> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -mkdir /user/haiyang1/test8
> You need to request nn1 when you execute the msync method,
> Actually connect nn2 first and failover is required
> In connection nn3 does not meet the requirements, failover needs to be 
> performed, but at this time, failover operation needs to be performed during 
> a period of hibernation
> Finally, it took a period of hibernation to connect the successful request to 
> nn1
> In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current 
> default implementation is Sleep time is calculated when more than one 
> failover operation is performed
> I think that the Number of NameNodes as a condition of calculation of sleep 
> time is more reasonable
> That is, in the current test, executing failover on connection nn3 does not 
> need to sleep time to directly connect to the next nn node
> See client_error.log for details



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-12-03 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986891#comment-16986891
 ] 

huhaiyang commented on HDFS-15024:
--

In RetryPolicies Implemented

{code:java}
/**
 * @return 0 if this is our first failover/retry (i.e., retry immediately),
 * sleep exponentially otherwise
 */
private long getFailoverOrRetrySleepTime(int times) {
  return times == 0 ? 0 :
      calculateExponentialTime(delayMillis, times, maxDelayBase);
}
{code}

> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -mkdir /user/haiyang1/test8
> You need to request nn1 when you execute the msync method,
> Actually connect nn2 first and failover is required
> In connection nn3 does not meet the requirements, failover needs to be 
> performed, but at this time, failover operation needs to be performed during 
> a period of hibernation
> Finally, it took a period of hibernation to connect the successful request to 
> nn1
> In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current 
> default implementation is Sleep time is calculated when more than one 
> failover operation is performed
> I think that the Number of NameNodes as a condition of calculation of sleep 
> time is more reasonable
> That is, in the current test, executing failover on connection nn3 does not 
> need to sleep time to directly connect to the next nn node
> See client_error.log for details



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-12-03 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986882#comment-16986882
 ] 

huhaiyang edited comment on HDFS-15024 at 12/3/19 1:11 PM:
---

[~xkrogen] [~csun] [~vagarychen] Thanks for your comments!
As I understand it, normally, if we only set 2 NNs,

dfs.ha.namenodes.ns1
nn1,nn2


Currently,
nn1 is in active state
nn2 is in standby state

when the client connects to nn2 it needs to fail over and will quickly connect 
to nn1. However, if connecting to nn1 then fails due to network problems, the 
next attempt (the third) happens only after a sleep period.

HDFS-6440 added support for more than 2 NameNodes.
If we set 3 NNs,

dfs.ha.namenodes.ns1
nn1,nn2,nn3

nn1 is in active state
nn2 is in standby state
nn3 is in standby state (or observer state)

when the client connects to nn2 it needs to fail over and will quickly connect 
to nn3,
and when it connects to nn3 it needs to fail over and will quickly connect to 
nn1.
However, if connecting to nn1 then fails due to network problems, the next 
attempt (the fourth) happens only after a sleep period.

That is to say, the client needs to try all the configured NN nodes once;
only if no NN meeting the requirements is found should it sleep and retry...

In FailoverOnNetworkExceptionRetry#getFailoverOrRetrySleepTime, I think using 
the number of NameNodes as a condition in the sleep-time calculation is more 
reasonable (which is the current v01 patch).



was (Author: haiyang hu):
[~xkrogen][~csun][~vagarychen] Thanks for your comments!
I understand Normally, if we only set 2 NNS, 

dfs.ha.namenodes.ns1
nn1,nn2


Currently,
nn1 is in active state
nn2 is in standby state

when the client connects to nn2, it needs to retry, and will quickly connect to 
nn1. However, when the nn1 fails to connect due to network problems, the next 
time( the third time), the sleep timeout will be performed for a period of time 
to retry

Current HDFS-6440 Support more than 2 NameNodes.
if we set 3 NNS, 

dfs.ha.namenodes.ns1
nn1,nn2,nn3

nn1 is in active state
nn2 is in standby state
nn3 is in standby state(or observer state)

when the client connects to nn2, it needs to retry, and will quickly connect to 
nn3. 
and  the client connects to nn3, it needs to retry, and will quickly connect to 
nn1.
However, when the nn1 fails to connect due to network problems, the next time( 
the fourth time), the sleep timeout will be performed for a period of time to 
retry.

That is to say, it is necessary to connect all the configured NN nodes once.
If no NN nodes the requirements are found, required to perform sleep and 
retry...

In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime , I think that 
the Number of NameNodes as a condition of calculation of sleep time is more 
reasonable((which is current v01 patch)).


> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -mkdir /user/haiyang1/test8
> You need to request nn1 when you execute the msync method,
> Actually connect nn2 first and failover is required
> In connection nn3 does not meet the requirements, failover needs to be 
> performed, but at this time, failover operation needs to be performed during 
> a period of hibernation
> Finally, it took a period of hibernation to connect the successful request to 
> nn1
> In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current 
> default implementation is Sleep time is calculated when more than one 
> failover operation is performed
> I think that the Number of NameNodes as a condition of calculation of sleep 
> time is more reasonable
> That is, in the current test, executing failover on connection nn3 does not 
> need to sleep time to directly connect to the next nn node
> See client_error.log for details



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-12-03 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986882#comment-16986882
 ] 

huhaiyang edited comment on HDFS-15024 at 12/3/19 1:09 PM:
---

[~xkrogen][~csun][~vagarychen] Thanks for your comments!
As I understand it, normally, if we only set 2 NNs,

dfs.ha.namenodes.ns1
nn1,nn2


Currently,
nn1 is in active state
nn2 is in standby state

when the client connects to nn2 it needs to fail over and will quickly connect 
to nn1. However, if connecting to nn1 then fails due to network problems, the 
next attempt (the third) happens only after a sleep period.

HDFS-6440 added support for more than 2 NameNodes.
If we set 3 NNs,

dfs.ha.namenodes.ns1
nn1,nn2,nn3

nn1 is in active state
nn2 is in standby state
nn3 is in standby state (or observer state)

when the client connects to nn2 it needs to fail over and will quickly connect 
to nn3,
and when it connects to nn3 it needs to fail over and will quickly connect to 
nn1.
However, if connecting to nn1 then fails due to network problems, the next 
attempt (the fourth) happens only after a sleep period.

That is to say, the client needs to try all the configured NN nodes once;
only if no NN meeting the requirements is found should it sleep and retry...

In FailoverOnNetworkExceptionRetry#getFailoverOrRetrySleepTime, I think using 
the number of NameNodes as a condition in the sleep-time calculation is more 
reasonable (which is the current v01 patch).



was (Author: haiyang hu):
[~xkrogen][~csun][~vagarychen] Thanks for your comments!
I understand Normally, if we only set 2 NNS, 

dfs.ha.namenodes.ns1
nn1,nn2


Currently,
nn1 is in active state
nn2 is in standby state

when the client connects to nn2, it needs to retry, and will quickly connect to 
nn1. However, when the nn1 fails to connect due to network problems, the next 
time( the third time), the sleep timeout will be performed for a period of time 
to retry

Current HDFS-6440 Support more than 2 NameNodes.
if we set 3 NNS, 

dfs.ha.namenodes.ns1
nn1,nn2,nn3

nn1 is in active state
nn2 is in standby state
nn3 is in standby state(or observer state)

when the client connects to nn1, it needs to retry, and will quickly connect to 
nn2. 
and  the client connects to nn2, it needs to retry, and will quickly connect to 
nn1.
However, when the nn1 fails to connect due to network problems, the next time( 
the fourth time), the sleep timeout will be performed for a period of time to 
retry.

That is to say, it is necessary to connect all the configured NN nodes once.
If no NN nodes the requirements are found, required to perform sleep and 
retry...

In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime , I think that 
the Number of NameNodes as a condition of calculation of sleep time is more 
reasonable((which is current v01 patch)).


> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -mkdir /user/haiyang1/test8
> You need to request nn1 when you execute the msync method,
> Actually connect nn2 first and failover is required
> In connection nn3 does not meet the requirements, failover needs to be 
> performed, but at this time, failover operation needs to be performed during 
> a period of hibernation
> Finally, it took a period of hibernation to connect the successful request to 
> nn1
> In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current 
> default implementation is Sleep time is calculated when more than one 
> failover operation is performed
> I think that the Number of NameNodes as a condition of calculation of sleep 
> time is more reasonable
> That is, in the current test, executing failover on connection nn3 does not 
> need to sleep time to directly connect to the next nn node
> See client_error.log for details



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-12-03 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986882#comment-16986882
 ] 

huhaiyang commented on HDFS-15024:
--

[~xkrogen][~csun][~vagarychen] Thanks for your comments!
As I understand it, normally, if we only set 2 NNs,

dfs.ha.namenodes.ns1
nn1,nn2


Currently,
nn1 is in active state
nn2 is in standby state

when the client connects to nn2 it needs to fail over and will quickly connect 
to nn1. However, if connecting to nn1 then fails due to network problems, the 
next attempt (the third) happens only after a sleep period.

HDFS-6440 added support for more than 2 NameNodes.
If we set 3 NNs,

dfs.ha.namenodes.ns1
nn1,nn2,nn3

nn1 is in active state
nn2 is in standby state
nn3 is in standby state (or observer state)

when the client connects to nn1 it needs to fail over and will quickly connect 
to nn2,
and when it connects to nn2 it needs to fail over and will quickly connect to 
nn1.
However, if connecting to nn1 then fails due to network problems, the next 
attempt (the fourth) happens only after a sleep period.

That is to say, the client needs to try all the configured NN nodes once;
only if no NN meeting the requirements is found should it sleep and retry...

In FailoverOnNetworkExceptionRetry#getFailoverOrRetrySleepTime, I think using 
the number of NameNodes as a condition in the sleep-time calculation is more 
reasonable (which is the current v01 patch).
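
A minimal sketch of what the proposal amounts to (numNameNodes is a hypothetical field; the actual v01 patch may differ): retry immediately until every configured NameNode has been tried once, and only then back off.

{code:java}
// Sketch only: numNameNodes would come from the resolved HA config
// (e.g. the size of dfs.ha.namenodes.<ns>); it is not an existing field.
private long getFailoverOrRetrySleepTime(int times) {
  if (times < numNameNodes) {
    return 0;  // still cycling through untried NameNodes: no sleep
  }
  // Every NameNode has failed at least once; back off exponentially.
  return calculateExponentialTime(delayMillis, times - numNameNodes + 1,
      maxDelayBase);
}
{code}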


> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -mkdir /user/haiyang1/test8
> You need to request nn1 when you execute the msync method,
> Actually connect nn2 first and failover is required
> In connection nn3 does not meet the requirements, failover needs to be 
> performed, but at this time, failover operation needs to be performed during 
> a period of hibernation
> Finally, it took a period of hibernation to connect the successful request to 
> nn1
> In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current 
> default implementation is Sleep time is calculated when more than one 
> failover operation is performed
> I think that the Number of NameNodes as a condition of calculation of sleep 
> time is more reasonable
> That is, in the current test, executing failover on connection nn3 does not 
> need to sleep time to directly connect to the next nn node
> See client_error.log for details



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-12-03 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986867#comment-16986867
 ] 

Jinglun commented on HDFS-13811:


Hi [~linyiqun], thanks for your nice comments! Fixed the failed unit tests and 
followed your suggestions. Uploaded v06.

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, 
> HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch
>
>
> If we try to update quota of an existing mount entry and at the same time 
> periodic quota update service is running on the same mount entry, it is 
> leading the mount table to _inconsistent state._
> Here transactions are:
> A - Quota update service is fetching mount table entries.
> B - Quota update service is updating the mount table with current usage.
> A' - User is trying to update quota using admin cmd.
> and the transaction sequence is [ A A' B ]
> quota update service is updating the mount table with old quota value.
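
The interleaving above is a classic lost update; a small sketch with hypothetical store/entry types (not the actual RouterQuotaUpdateService code):

{code:java}
// Hypothetical API, for illustration only.
MountEntry cached = store.get("/mnt");     // A : entry cached, quota = 100
// admin: store.setQuota("/mnt", 200);     // A': new quota committed
cached.setUsage(currentUsage);             //     service refreshes usage
store.put("/mnt", cached);                 // B : quota silently back to 100
{code}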



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-12-03 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-13811:
---
Attachment: HDFS-13811.006.patch

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, 
> HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch
>
>
> If we try to update the quota of an existing mount entry while the periodic 
> quota update service is running on the same mount entry, the mount table can 
> be left in an _inconsistent state._
> The transactions here are:
> A - Quota update service is fetching mount table entries.
> B - Quota update service is updating the mount table with current usage.
> A' - User is trying to update quota using the admin cmd.
> With the transaction sequence [ A A' B ], the quota update service updates 
> the mount table with the old quota value.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2019-12-03 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986840#comment-16986840
 ] 

Stephen O'Donnell commented on HDFS-13671:
--

I have seen a few occurrences of an issue where operations against the 
foldedTreeSet structure (removeAndGet and get) appear to get worse over time.

In one case some time back, I recall a namenode that worked fine for about 4 
weeks and then deletes became painfully slow. The delete code held the NN lock 
for too long per iteration and ultimately blocked IBRs etc. Restarting the NN 
fixed the issue for about another 4 weeks, when it returned. Once the slowdown 
started happening it would not clear without a restart.

I recently came across a similar-looking issue where the Datanodes (which also 
use foldedTreeSet for the replica map) appeared to slow down on get operations 
against the foldedTreeSet; again, a restart seems to have cleared the problem.

This makes me wonder whether the data structure somehow degrades (e.g. does 
not balance correctly) over time, rather than being slow from the outset. 
However, I have not been able to figure out how that could be the case.

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Priority: Major
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> Actually the first step should be the more expensive operation and take 
> more time. However, we now always see the NN hang during the remove-block 
> operation.
> Looking into this: we introduced a new structure, {{FoldedTreeSet}}, to get 
> better performance in dealing with FBRs/IBRs. But compared with the earlier 
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower, 
> since it takes additional time to rebalance the tree nodes. When there are 
> many blocks to be removed/deleted, it looks bad.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide 
> {{getBlockIterator}} to return a block iterator, and no other get operation 
> for a specified block. Do we still need to use {{FoldedTreeSet}} in 
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not 
> Update. Maybe we can revert this to the earlier implementation.
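
As a reference for the second step, a minimal sketch of a chunked 
remove-block loop, under assumed names and chunk size (not the actual 
FSNamesystem code): the write lock is released between chunks, but every 
removeBlock() call still ends in FoldedTreeSet#removeAndGet and pays its 
rebalancing cost.

{code:java}
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Sketch (assumed names and chunk size, not verbatim Hadoop code) of the
 * chunked remove-block loop: the namesystem write lock is released between
 * chunks so IBRs can get in, but the per-block removal cost remains.
 */
class ChunkedBlockRemovalSketch {
  static final int BLOCK_DELETION_INCREMENT = 1000; // assumed chunk size
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();

  void removeBlocks(List<Long> blockIds) {
    int start = 0;
    while (start < blockIds.size()) {
      int end = Math.min(start + BLOCK_DELETION_INCREMENT, blockIds.size());
      fsLock.writeLock().lock();
      try {
        for (Long id : blockIds.subList(start, end)) {
          removeBlock(id); // stands in for BlockManager#removeBlock
        }
      } finally {
        fsLock.writeLock().unlock();
      }
      start = end;
    }
  }

  private void removeBlock(Long id) {
    // placeholder: the real call chain ends in FoldedTreeSet#removeAndGet
  }
}
{code}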



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-03 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986832#comment-16986832
 ] 

Fei Hui commented on HDFS-14998:


[~csun] [~ayushtkn] Thanks for your comments.
{quote}
With ZKFC running in both hosts, user just need to run the DFS admin commands 
to achieve that.
{quote}

*Without* ZKFC running, the user could also run the DFS admin commands to 
achieve that (transition an observer to standby, or a standby to observer). 
Is that right?
The only benefit of running ZKFC on the Observer NameNode is that it would 
join the election when the Observer NameNode is transitioned to the Standby 
state.

From your words, the dynamic transition (between observer and standby) can 
only be done when ZKFC is running on it. But I think the dynamic transition 
can be done without ZKFC running. Please correct me, thanks.

> Update Observer Namenode doc for ZKFC after HDFS-14130
> --
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch
>
>
> After HDFS-14130, we should update the Observer NameNode doc: the Observer 
> NameNode can now run with ZKFC running.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15028) Keep the capacity of volume and reduce a system call

2019-12-03 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986774#comment-16986774
 ] 

Yang Yun commented on HDFS-15028:
-

I don't have numbers for how much overhead the change saves on a local disk; I 
think it can be ignored there. But we hit this issue when we mount a slow 
remote volume as ARCHIVE storage.

Yes, it breaks detection of underlying changes to the volume size. The fix 
should be configurable and disabled by default to keep the current behavior.
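
A minimal sketch of that idea, with a hypothetical config key and class name 
(not the actual patch): cache the capacity read once and reuse it for each 
heartbeat, behind a flag that defaults to off.

{code:java}
import java.io.File;

/**
 * Hypothetical volume wrapper: when the (assumed) fixed-volume-size flag is
 * on, the capacity is read once and reused for every heartbeat, skipping the
 * per-heartbeat getTotalSpace() system call; off by default, it preserves
 * the current behavior of re-reading the capacity each time.
 */
class CachedCapacityVolume {
  // hypothetical config key, disabled by default
  static final String FIXED_VOLUME_SIZE_KEY = "dfs.datanode.fixed.volume.size";

  private final File root;
  private final boolean fixedSizeEnabled;
  private long cachedCapacity = -1;

  CachedCapacityVolume(File root, boolean fixedSizeEnabled) {
    this.root = root;
    this.fixedSizeEnabled = fixedSizeEnabled;
  }

  long getCapacity() {
    if (fixedSizeEnabled && cachedCapacity > 0) {
      return cachedCapacity;               // reuse the first value
    }
    cachedCapacity = root.getTotalSpace(); // one statfs-style system call
    return cachedCapacity;
  }
}
{code}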

> Keep the capacity of volume and reduce a system call
> 
>
> Key: HDFS-15028
> URL: https://issues.apache.org/jira/browse/HDFS-15028
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15028.patch
>
>
> The local volume does not change, so keep the first value of the capacity 
> and reuse it for each heartbeat.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-12-03 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986773#comment-16986773
 ] 

Yiqun Lin commented on HDFS-13811:
--

Thanks for updating the patch, [~LiJinglun]. The change almost looks good to 
me. Some minor comments:
 * Can you update the javadoc of the quota service class?
{code:java}
/**
 * Service to periodically update the {@link RouterQuotaUsage}
 * cached information in the {@link Router} and update corresponding
 * mount table in State Store.
 */
public class RouterQuotaUpdateService extends PeriodicService {
{code}

 * I don't think we need to change the following places, since the mount table 
entry queried from the state store should already contain the quota usage now.
{code:java}
-updatedMountTable = getMountTable(path);
-quota = updatedMountTable.getQuota();
...
-assertEquals(2, mountQuota1.getFileAndDirectoryCount());
+assertEquals(0, mountQuota1.getFileAndDirectoryCount());
{code}

 * Can you look into the failed unit test? It's related.

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, 
> HDFS-13811.004.patch, HDFS-13811.005.patch
>
>
> If we try to update the quota of an existing mount entry while the periodic 
> quota update service is running on the same mount entry, the mount table can 
> be left in an _inconsistent state._
> The transactions here are:
> A - Quota update service is fetching mount table entries.
> B - Quota update service is updating the mount table with current usage.
> A' - User is trying to update quota using the admin cmd.
> With the transaction sequence [ A A' B ], the quota update service updates 
> the mount table with the old quota value.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org