[jira] [Comment Edited] (HDFS-15468) Active namenode crashed when no edit recover

2020-10-09 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211561#comment-17211561
 ] 

Ayush Saxena edited comment on HDFS-15468 at 10/10/20, 5:50 AM:


Thanks [~kpalanisamy] for the report. I am not sure it is related to safemode; I 
could repro this without the namenode being in safemode.
 I got some similar exception traces:
{noformat}
127.0.0.1:59233: Can't write, no segment open ; journal id: myjournal
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:544)
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:405)
{noformat}
and
{noformat}
2020-10-10 10:30:30,058 [FSEditLogAsync] ERROR namenode.FSEditLog 
(JournalSet.java:mapJournalsAndReportErrors(406)) - Error: flush failed for 
(journal JournalAndStream(mgr=QJM to [127.0.0.1:59233, 127.0.0.1:59235, 
127.0.0.1:59237], stream=QuorumOutputStream starting at txid 1))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions 
to achieve quorum size 2/3. 1 successful responses:
127.0.0.1:59237: null [success]
2 exceptions thrown:
{noformat}
Please check whether it happens specifically with safemode only for you.

[~Amithsha] Let me know if you want to try to reproduce this; I wrote a UT for 
this that you can try.

Regarding safemode, it is just preventing you from making a write call to 
the NN; otherwise the NN would have crashed earlier, when the JN went down.

I am not sure there is a fix for this. You can't (and shouldn't) make the JNs 
recover the last segment, for many reasons. Persisting the Namenode 
state and making it call startLogSegment and so on is too far off the design.

Secondly, I think this is expected as well: if you lose the quorum, 
the Namenode is expected to crash. The Namenode is not expected to lose 
quorum in any case, and if it does, it is considered an 
alarming situation.

The admin should perform maintenance in a way that the Namenode doesn't lose the 
quorum. I don't think this would happen in a production environment, except maybe 
if someone is trying out an upgrade without being careful. This is documented as 
well:

{noformat}
JNs is relatively stable and does not require upgrade when upgrading HDFS in 
most of the cases..Upgrading JNs and ZKNs may incur cluster 
downtime.{noformat}
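
The "quorum size 2/3" in the trace above can be illustrated with a small sketch. It assumes a simple majority rule over the journal nodes; the class and method names are hypothetical and are not taken from the QJM client code.

```java
public class QuorumCheck {
    // Smallest number of successful responses that forms a majority of n nodes.
    // Assumption: simple majority, which matches the "2/3" printed in the trace.
    static int majority(int n) {
        return n / 2 + 1;
    }

    // True if the number of successful responses reaches the majority.
    static boolean hasQuorum(int successes, int n) {
        return successes >= majority(n);
    }

    public static void main(String[] args) {
        // Three journal nodes, one success and two exceptions, as in the trace.
        System.out.println(majority(3));
        System.out.println(hasQuorum(1, 3));
        // With only one journal node restarted at a time, two still respond.
        System.out.println(hasQuorum(2, 3));
    }
}
```

Under this assumption, restarting one JN at a time keeps two of three responding, while restarting two at once leaves a single success and the write fails.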



was (Author: ayushtkn):
Thanks [~kpalanisamy] for the report. I am not sure it is related to safemode; I 
could repro this without the namenode being in safemode.
I got some similar exception traces:

{noformat}
127.0.0.1:59233: Can't write, no segment open ; journal id: myjournal
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:544)
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:405)
{noformat}

and 

{noformat}
2020-10-10 10:30:30,058 [FSEditLogAsync] ERROR namenode.FSEditLog 
(JournalSet.java:mapJournalsAndReportErrors(406)) - Error: flush failed for 
(journal JournalAndStream(mgr=QJM to [127.0.0.1:59233, 127.0.0.1:59235, 
127.0.0.1:59237], stream=QuorumOutputStream starting at txid 1))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions 
to achieve quorum size 2/3. 1 successful responses:
127.0.0.1:59237: null [success]
2 exceptions thrown:
{noformat}

Please check whether it happens specifically with safemode only for you.

[~Amithsha] Let me know if you want to try to reproduce this; I wrote a UT for 
this that you can try.

> Active namenode crashed when no edit recover
> 
>
> Key: HDFS-15468
> URL: https://issues.apache.org/jira/browse/HDFS-15468
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0
>Reporter: Karthik Palanisamy
>Priority: Critical
>
> If the namenode is in safe mode and two journal nodes are restarted for 
> maintenance activity,
>  the journal nodes will not finalize the last edit segment, which 
> is the in-progress edit segment. 
>  This last edit segment is finalized or recovered only on an edit roll 
> operation, or on an epoch change due to namenode failover.
>  But in the current scenario there is no failover; the namenode is just in safe mode. 
> If we leave safe mode, the active namenode will crash.
>  For example, 
>  the current open segment is edits_inprogress_10356376710, but it is 
> not recovered or finalized after the JN2 restart. I think we need to recover the 
> edits after a JN restart. 
> {code:java}
> Journal node 
> 2020-06-20 16:11:53,458 INFO  server.Journal 
> (Journal.java:scanStorageForLatestEdits(193)) - Latest log is 
> EditLogFile(file=/hadoop/hdfs/journal/xxx/current/edits_inprogress_10356376710,first=10356376710,last=10356376710,inProgress=true,hasCorruptHeader=false)
> 2020-06-20 

[jira] [Commented] (HDFS-15468) Active namenode crashed when no edit recover

2020-10-09 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211561#comment-17211561
 ] 

Ayush Saxena commented on HDFS-15468:
-

Thanks [~kpalanisamy] for the report. I am not sure it is related to safemode; I 
could repro this without the namenode being in safemode.
I got some similar exception traces:

{noformat}
127.0.0.1:59233: Can't write, no segment open ; journal id: myjournal
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:544)
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:405)
{noformat}

and 

{noformat}
2020-10-10 10:30:30,058 [FSEditLogAsync] ERROR namenode.FSEditLog 
(JournalSet.java:mapJournalsAndReportErrors(406)) - Error: flush failed for 
(journal JournalAndStream(mgr=QJM to [127.0.0.1:59233, 127.0.0.1:59235, 
127.0.0.1:59237], stream=QuorumOutputStream starting at txid 1))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions 
to achieve quorum size 2/3. 1 successful responses:
127.0.0.1:59237: null [success]
2 exceptions thrown:
{noformat}

Please check whether it happens specifically with safemode only for you.

[~Amithsha] Let me know if you want to try to reproduce this; I wrote a UT for 
this that you can try.

> Active namenode crashed when no edit recover
> 
>
> Key: HDFS-15468
> URL: https://issues.apache.org/jira/browse/HDFS-15468
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0
>Reporter: Karthik Palanisamy
>Priority: Critical
>
> If the namenode is in safe mode and two journal nodes are restarted for 
> maintenance activity,
>  the journal nodes will not finalize the last edit segment, which 
> is the in-progress edit segment. 
>  This last edit segment is finalized or recovered only on an edit roll 
> operation, or on an epoch change due to namenode failover.
>  But in the current scenario there is no failover; the namenode is just in safe mode. 
> If we leave safe mode, the active namenode will crash.
>  For example, 
>  the current open segment is edits_inprogress_10356376710, but it is 
> not recovered or finalized after the JN2 restart. I think we need to recover the 
> edits after a JN restart. 
> {code:java}
> Journal node 
> 2020-06-20 16:11:53,458 INFO  server.Journal 
> (Journal.java:scanStorageForLatestEdits(193)) - Latest log is 
> EditLogFile(file=/hadoop/hdfs/journal/xxx/current/edits_inprogress_10356376710,first=10356376710,last=10356376710,inProgress=true,hasCorruptHeader=false)
> 2020-06-20 16:19:06,397 INFO  ipc.Server (Server.java:logException(2435)) - 
> IPC Server handler 3 on 8485, call 
> org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.journal from 
> 10.x.x.x:28444 Call#49083225 Retry#0
> org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: Can't 
> write, no segment open
>         at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:484)
> {code}
> {code:java}
> Namenode log:
> org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many 
> exceptions to achieve quorum size 2/3. 1 successful responses:
> 10.x.x.x:8485: null [success]
> 2 exceptions thrown:
> 10.y.y.y:8485: Can't write, no segment open
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-09 Thread YaYun Wang (Jira)
YaYun Wang created HDFS-15624:
-

 Summary:  Fix the SetQuotaByStorageTypeOp problem after updating 
hadoop 
 Key: HDFS-15624
 URL: https://issues.apache.org/jira/browse/HDFS-15624
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Reporter: YaYun Wang


HDFS-15025 added a new StorageType, NVDIMM, which changed the ordinal() values of the 
StorageType enum. Setting the quota by storage type depends on ordinal(), 
so quota settings may become invalid after an upgrade.
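
A minimal sketch of why ordinal-based persistence is fragile. The two enums below are hypothetical, simplified stand-ins and do not reproduce the real StorageType constants or their order; the point is only that inserting a constant shifts the ordinals of everything after it, while name-based decoding is unaffected.

```java
public class OrdinalShiftDemo {
    // Hypothetical enum as it might look before a new constant is added.
    enum TypeV1 { RAM_DISK, SSD, DISK, ARCHIVE }

    // Hypothetical enum after a new constant is inserted in the middle.
    enum TypeV2 { RAM_DISK, NVDIMM, SSD, DISK, ARCHIVE }

    // Persist a value by ordinal, as an ordinal-based serializer would.
    static int persist(TypeV1 type) {
        return type.ordinal();
    }

    // Decode a persisted ordinal against the newer enum definition.
    static TypeV2 readByOrdinal(int ordinal) {
        return TypeV2.values()[ordinal];
    }

    // Decode by name instead; this is stable across insertions.
    static TypeV2 readByName(String name) {
        return TypeV2.valueOf(name);
    }

    public static void main(String[] args) {
        int stored = persist(TypeV1.SSD);                  // written before the upgrade
        System.out.println(readByOrdinal(stored));         // prints NVDIMM
        System.out.println(readByName(TypeV1.SSD.name())); // prints SSD
    }
}
```

Two common ways out are appending new constants at the end of the enum or persisting an explicit, stable identifier instead of ordinal(); which one fits here is for the patch to decide.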






[jira] [Commented] (HDFS-15622) Deleted blocks linger in the replications queue

2020-10-09 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211500#comment-17211500
 ] 

Hadoop QA commented on HDFS-15622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 32m 
35s{color} |  | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} |  | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch does not contain any @author tags. 
{color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} |  | {color:green} The patch appears to include 2 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
17s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 32s{color} |  | {color:green} branch has no errors when building and 
testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m  
7s{color} |  | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} |  | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
10s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch has no blanks issues. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m  6s{color} |  | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
11s{color} |  | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} || ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 51s{color} 
| 

[jira] [Updated] (HDFS-15622) Deleted blocks linger in the replications queue

2020-10-09 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated HDFS-15622:
-
Attachment: HDFS-15622.001.patch
Status: Patch Available  (was: In Progress)

> Deleted blocks linger in the replications queue
> ---
>
> Key: HDFS-15622
> URL: https://issues.apache.org/jira/browse/HDFS-15622
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: HDFS-15622.001.patch
>
>
> We had an incident where, after resolving a missing-blocks incident by 
> restarting two dead nodes, 8 blocks were still reported missing, but the list was 
> empty. Metasave shows the 8 blocks are "orphaned", meaning the files were 
> already deleted. It is unclear why they were left in the replication queue.
> * The containing node was flaky and was started and stopped multiple times.
> * Block allocation didn't work well due to cluster-level storage 
> space exhaustion.
> * The NN was in safe mode.
> Triggering a full block report from the node didn't have any effect. The queue will 
> clear up if a failover happens, as the repl queue will be reinitialized.






[jira] [Work started] (HDFS-15622) Deleted blocks linger in the replications queue

2020-10-09 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-15622 started by Ahmed Hussein.

> Deleted blocks linger in the replications queue
> ---
>
> Key: HDFS-15622
> URL: https://issues.apache.org/jira/browse/HDFS-15622
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>
> We had an incident where, after resolving a missing-blocks incident by 
> restarting two dead nodes, 8 blocks were still reported missing, but the list was 
> empty. Metasave shows the 8 blocks are "orphaned", meaning the files were 
> already deleted. It is unclear why they were left in the replication queue.
> * The containing node was flaky and was started and stopped multiple times.
> * Block allocation didn't work well due to cluster-level storage 
> space exhaustion.
> * The NN was in safe mode.
> Triggering a full block report from the node didn't have any effect. The queue will 
> clear up if a failover happens, as the repl queue will be reinitialized.






[jira] [Created] (HDFS-15623) Respect configured values of rpc.engine

2020-10-09 Thread Hector Sandoval Chaverri (Jira)
Hector Sandoval Chaverri created HDFS-15623:
---

 Summary: Respect configured values of rpc.engine
 Key: HDFS-15623
 URL: https://issues.apache.org/jira/browse/HDFS-15623
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: Hector Sandoval Chaverri


The HDFS Configuration allows users to specify the RPCEngine implementation to 
use when communicating with Datanodes and Namenodes. However, the value is 
overwritten with ProtobufRpcEngine.class in several classes. For example, in 
NameNodeRpcServer:

{{RPC.setProtocolEngine(conf, ClientNamenodeProtocolPB.class, 
ProtobufRpcEngine.class);}}

The configured value of rpc.engine.[protocolName] should be respected to 
allow for other implementations of RPCEngine to be used.
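
The difference between overwriting and respecting a configured value can be sketched without Hadoop on the classpath. The class below models the configuration as a plain Map; the helper names, the protocol name and the engine class names are hypothetical, and this is not the Hadoop RPC API.

```java
import java.util.HashMap;
import java.util.Map;

public class RpcEngineDefaults {
    // Key prefix taken from the description above: rpc.engine.[protocolName]
    static final String PREFIX = "rpc.engine.";

    // Unconditional set: replaces whatever the user configured.
    static void setEngine(Map<String, String> conf, String protocol, String engine) {
        conf.put(PREFIX + protocol, engine);
    }

    // Respectful variant: applies the default only when no value is configured.
    static void setEngineIfUnset(Map<String, String> conf, String protocol, String engine) {
        conf.putIfAbsent(PREFIX + protocol, engine);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical user setting for a hypothetical protocol name.
        conf.put(PREFIX + "ExampleProtocolPB", "com.example.CustomRpcEngine");

        setEngineIfUnset(conf, "ExampleProtocolPB", "ProtobufRpcEngine");
        System.out.println(conf.get(PREFIX + "ExampleProtocolPB")); // user value kept

        setEngine(conf, "ExampleProtocolPB", "ProtobufRpcEngine");
        System.out.println(conf.get(PREFIX + "ExampleProtocolPB")); // user value lost
    }
}
```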






[jira] [Commented] (HDFS-15618) Improve datanode shutdown latency

2020-10-09 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211241#comment-17211241
 ] 

Hadoop QA commented on HDFS-15618:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
24s{color} |  | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} |  | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch does not contain any @author tags. 
{color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} |  | {color:green} The patch appears to include 1 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
45s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
16s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 27s{color} |  | {color:green} branch has no errors when building and 
testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
13s{color} |  | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
10s{color} |  | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 8s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch has no blanks issues. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 43s{color} | 
[/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt|https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/222/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt]
 | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 6 new + 
654 unchanged - 0 fixed = 660 total (was 654) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
9s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m  8s{color} |  | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
14s{color} |  | {color:green} the patch passed 

[jira] [Commented] (HDFS-15601) Batch listing: gracefully fallback to use non-batched listing when NameNode doesn't support the feature

2020-10-09 Thread Aihua Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211161#comment-17211161
 ] 

Aihua Xu commented on HDFS-15601:
-

Seems I won't have time to work on this. Assign back.

> Batch listing: gracefully fallback to use non-batched listing when NameNode 
> doesn't support the feature
> ---
>
> Key: HDFS-15601
> URL: https://issues.apache.org/jira/browse/HDFS-15601
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Chao Sun
>Priority: Major
>
> HDFS-13616 requires both server-side and client-side changes. However, it is common 
> for users to use a newer client to talk to an older HDFS (say 2.10). Currently the 
> client will simply fail in this scenario. A better approach, perhaps, is to 
> have the client fall back to non-batched listing on the input directories.
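
The fallback proposed above could look roughly like the sketch below. The Lister interface and OldServerLister are hypothetical stand-ins for a client talking to a server without batched listing, and using UnsupportedOperationException as the "not supported" signal is an assumption, not what the HDFS client actually throws.

```java
import java.util.ArrayList;
import java.util.List;

public class ListingFallback {
    // Hypothetical client abstraction; not the HDFS client API.
    interface Lister {
        // May throw UnsupportedOperationException when the server is too old.
        List<String> listBatched(List<String> dirs);

        List<String> list(String dir);
    }

    // Hypothetical lister for an older server without batched listing.
    static class OldServerLister implements Lister {
        @Override
        public List<String> listBatched(List<String> dirs) {
            throw new UnsupportedOperationException("batched listing not supported");
        }

        @Override
        public List<String> list(String dir) {
            List<String> out = new ArrayList<>();
            out.add(dir + "/file");
            return out;
        }
    }

    // Try the batched call first; fall back to one call per directory.
    static List<String> listAll(Lister lister, List<String> dirs) {
        try {
            return lister.listBatched(dirs);
        } catch (UnsupportedOperationException e) {
            List<String> out = new ArrayList<>();
            for (String dir : dirs) {
                out.addAll(lister.list(dir));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        List<String> dirs = new ArrayList<>();
        dirs.add("/x");
        dirs.add("/y");
        System.out.println(listAll(new OldServerLister(), dirs));
    }
}
```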






[jira] [Assigned] (HDFS-15601) Batch listing: gracefully fallback to use non-batched listing when NameNode doesn't support the feature

2020-10-09 Thread Aihua Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HDFS-15601:
---

Assignee: (was: Aihua Xu)

> Batch listing: gracefully fallback to use non-batched listing when NameNode 
> doesn't support the feature
> ---
>
> Key: HDFS-15601
> URL: https://issues.apache.org/jira/browse/HDFS-15601
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Chao Sun
>Priority: Major
>
> HDFS-13616 requires both server-side and client-side changes. However, it is common 
> for users to use a newer client to talk to an older HDFS (say 2.10). Currently the 
> client will simply fail in this scenario. A better approach, perhaps, is to 
> have the client fall back to non-batched listing on the input directories.






[jira] [Work logged] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?focusedWorklogId=498639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498639
 ]

ASF GitHub Bot logged work on HDFS-15620:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 16:16
Start Date: 09/Oct/20 16:16
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#2375:
URL: https://github.com/apache/hadoop/pull/2375#discussion_r502537425



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/fs/contract/router/web/TestRouterWebHDFSContractRootDirectory.java
##
@@ -71,4 +71,9 @@ public void testRmRootRecursive() {
   public void testRmEmptyRootDirRecursive() {
 // It doesn't apply because we still have the mount points here
   }
+
+  @Override
+  public void testSimpleRootListing() {
+// It doesn't apply because DFSRouter dosn't support LISTSTATUS_BATCH.

Review comment:
   prefer an `Assume.assumeTrue(unsupported, false)` so it goes into the 
skipped stats. But, given the rest of the tests are the same, not much of an 
issue.
   
   

##
File path: hadoop-hdfs-project/hadoop-hdfs-rbf/pom.xml
##
@@ -114,6 +114,11 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd;>
   mockito-core
   test
 
+

Review comment:
   assertj is already exported in test scope for hadoop-common. I'd have 
expected it to be picked up.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 498639)
Time Spent: 50m  (was: 40m)

> RBF: Fix test failures after HADOOP-17281
> -
>
> Key: HDFS-15620
> URL: https://issues.apache.org/jira/browse/HDFS-15620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
> test cases. In RBF, the following tests are affected and they are now failing:
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
> * hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure






[jira] [Commented] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211122#comment-17211122
 ] 

Steve Loughran commented on HDFS-15620:
---

Aah, sorry. We'd deliberately added a blank line in the hdfs module to trigger 
a dfs test run, but clearly that didn't trigger the rbf tests. 


> RBF: Fix test failures after HADOOP-17281
> -
>
> Key: HDFS-15620
> URL: https://issues.apache.org/jira/browse/HDFS-15620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
> test cases. In RBF, the following tests are affected and they are now failing:
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
> * hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure






[jira] [Commented] (HDFS-15621) Datanode DirectoryScanner uses excessive memory

2020-10-09 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211086#comment-17211086
 ] 

Stephen O'Donnell commented on HDFS-15621:
--

[~belugabehr] That is a good idea. I wonder if it could be made to work ...

Right now, we scan the disks and build up a large list of ScanInfo objects, 
which is really a list of paths with blockfile and metafile and blockfile size.

Then later, we take a snapshot of the blocks in memory and then:

1. Look for blocks in memory and not on disk.

2. Look for blocks on disk and not in memory.

3. Look for mis-matches between genstamps etc.
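
The three comparisons above can be sketched with plain maps. This is a simplified model, assuming each side is a map from block id to generation stamp; the class and method names are hypothetical and the real DirectoryScanner works on ScanInfo and replica objects, not on maps like these.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ScanDiff {
    // Block ids present in 'a' but absent from 'b'.
    static Set<Long> onlyIn(Map<Long, Long> a, Map<Long, Long> b) {
        Set<Long> out = new TreeSet<>(a.keySet());
        out.removeAll(b.keySet());
        return out;
    }

    // Block ids present on both sides whose generation stamps differ.
    static Set<Long> genStampMismatches(Map<Long, Long> memory, Map<Long, Long> disk) {
        Set<Long> out = new TreeSet<>();
        for (Map.Entry<Long, Long> entry : memory.entrySet()) {
            Long onDisk = disk.get(entry.getKey());
            if (onDisk != null && !onDisk.equals(entry.getValue())) {
                out.add(entry.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Long, Long> memory = new HashMap<>();
        memory.put(1L, 10L);
        memory.put(2L, 20L);
        memory.put(3L, 30L);

        Map<Long, Long> disk = new HashMap<>();
        disk.put(2L, 20L);
        disk.put(3L, 31L);
        disk.put(4L, 40L);

        System.out.println(onlyIn(memory, disk));             // 1. in memory, not on disk
        System.out.println(onlyIn(disk, memory));             // 2. on disk, not in memory
        System.out.println(genStampMismatches(memory, disk)); // 3. genstamp mismatches
    }
}
```

Check (1) is the one that needs the complete set of on-disk ids before it can run, which is why it is the awkward case for a streaming approach.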

With this queue approach, the tricky part is (1) as we need the full list of 
what is on disk to be able to find what is missing, but we could solve that by 
storing all the blocks we have seen on disk and then comparing at the end.

As the scanner stands now, it is not perfect either, as it finds a bunch of 
false positive differences. This is because the disk scan takes a long time, 
and the disk and memory state are continuously changing.

If we took a disk scan result, and immediately compared it with memory, then we 
would cut down on those false positive differences and save a lot of memory.

> Datanode DirectoryScanner uses excessive memory
> ---
>
> Key: HDFS-15621
> URL: https://issues.apache.org/jira/browse/HDFS-15621
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: Screenshot 2020-10-09 at 14.11.36.png, Screenshot 
> 2020-10-09 at 15.20.56.png
>
>
> We generally work a rule of 1GB heap on a datanode per 1M blocks. For nodes 
> with a lot of blocks, this can mean a lot of heap.
> We recently captured a heapdump of a DN with about 22M blocks and found only 
> about 1.5GB was occupied by the ReplicaMap. Another 9GB of the heap is taken 
> by the DirectoryScanner ScanInfo objects. Most of this memory was allocated to 
> strings.
> Checking the strings in question, we can see two strings per scanInfo, 
> looking like:
> {code}
> /current/BP-671271071-10.163.205.13-1552020401842/current/finalized/subdir28/subdir17/blk_1180438785
> _106716708.meta
> {code}
> I will update a screen shot from MAT showing this.
> For the first string especially, the part 
> "/current/BP-671271071-10.163.205.13-1552020401842/current/finalized/" will 
> be the same for every block in the block pool as the scanner is only 
> concerned about finalized blocks.
> We can probably also store just the subdir indexes "28" and "17" rather than 
> "subdir28/subdir17" and then construct the path when it is requested via the 
> getter.






[jira] [Commented] (HDFS-15621) Datanode DirectoryScanner uses excessive memory

2020-10-09 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211069#comment-17211069
 ] 

David Mollitor commented on HDFS-15621:
---

Another option would be to multi-thread the operation and use a blocking queue 
to regulate memory consumption.

 

Multiple threads are scanning directories, and pumping results into a queue.  
One or more thread processes the data in the queue.  If the queue is full, 
scanners block.  In this way, the number of objects that exist at one time is 
controlled.
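
A rough sketch of what I mean, using a bounded `ArrayBlockingQueue`. This is only an illustration under assumptions: the class name, the method, and the use of plain block IDs as queue items are all made up for the example and are not from the DataNode code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: scanner threads feed a bounded queue, one consumer drains it.
final class BoundedScanSketch {
  // Sentinel a scanner enqueues when it has nothing more to report.
  // Real block IDs in this sketch are always >= 0.
  private static final long DONE = -1L;

  /** Returns the number of scan results the consumer processed. */
  static int scanAll(int scanners, int blocksPerScanner, int queueCapacity) {
    BlockingQueue<Long> queue = new ArrayBlockingQueue<>(queueCapacity);
    List<Thread> threads = new ArrayList<>();
    for (int s = 0; s < scanners; s++) {
      final long firstId = (long) s * blocksPerScanner;
      Thread t = new Thread(() -> {
        try {
          for (int i = 0; i < blocksPerScanner; i++) {
            queue.put(firstId + i); // blocks while the queue is full
          }
          queue.put(DONE);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      t.setDaemon(true);
      t.start();
      threads.add(t);
    }

    int processed = 0;
    try {
      int finished = 0;
      while (finished < scanners) {
        long id = queue.take(); // blocks while the queue is empty
        if (id == DONE) {
          finished++;
        } else {
          processed++; // a real consumer would compare against memory here
        }
      }
      for (Thread t : threads) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while draining scan queue", e);
    }
    return processed;
  }
}
```

At most `queueCapacity` results are queued at any time, so the memory held by pending scan results is bounded by the queue size rather than by the number of blocks on the node.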

> Datanode DirectoryScanner uses excessive memory
> ---
>
> Key: HDFS-15621
> URL: https://issues.apache.org/jira/browse/HDFS-15621
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: Screenshot 2020-10-09 at 14.11.36.png, Screenshot 
> 2020-10-09 at 15.20.56.png
>
>
> We generally work a rule of 1GB heap on a datanode per 1M blocks. For nodes 
> with a lot of blocks, this can mean a lot of heap.
> We recently captured a heapdump of a DN with about 22M blocks and found only 
> about 1.5GB was occupied by the ReplicaMap. Another 9GB of the heap is taken 
> by the DirectoryScanner ScanInfo objects. Most of this memory was allocated to 
> strings.
> Checking the strings in question, we can see two strings per scanInfo, 
> looking like:
> {code}
> /current/BP-671271071-10.163.205.13-1552020401842/current/finalized/subdir28/subdir17/blk_1180438785
> _106716708.meta
> {code}
> I will update a screen shot from MAT showing this.
> For the first string especially, the part 
> "/current/BP-671271071-10.163.205.13-1552020401842/current/finalized/" will 
> be the same for every block in the block pool as the scanner is only 
> concerned about finalized blocks.
> We can probably also store just the subdir indexes "28" and "17" rather than 
> "subdir28/subdir17" and then construct the path when it is requested via the 
> getter.






[jira] [Updated] (HDFS-15618) Improve datanode shutdown latency

2020-10-09 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated HDFS-15618:
-
Status: Patch Available  (was: In Progress)

> Improve datanode shutdown latency
> -
>
> Key: HDFS-15618
> URL: https://issues.apache.org/jira/browse/HDFS-15618
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: HDFS-15618.001.patch
>
>
> Datanode shutdown has a very long latency. A block scanner waits for 5 
> minutes to join on each VolumeScanner thread.
> Since the scanners are daemon threads and do not alter the block content, it 
> is safe to ignore such conditions on shutdown of Datanode.






[jira] [Created] (HDFS-15622) Deleted blocks linger in the replications queue

2020-10-09 Thread Ahmed Hussein (Jira)
Ahmed Hussein created HDFS-15622:


 Summary: Deleted blocks linger in the replications queue
 Key: HDFS-15622
 URL: https://issues.apache.org/jira/browse/HDFS-15622
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


We had an incident in which, after resolving a missing-blocks incident by 
restarting two dead nodes, 8 blocks were still reported missing, but the list 
of missing blocks was empty. Metasave shows the 8 blocks are "orphaned", 
meaning the files were already deleted. It is unclear why they were left in the 
replication queue.

* The containing node was flaky and was started and stopped multiple times.
* The block allocation didn't work well due to cluster-level storage space 
exhaustion.
* The NN was in safe mode.

Triggering a full block report from the node didn't have any effect. The count 
clears if a failover happens, because the replication queue is reinitialized.






[jira] [Comment Edited] (HDFS-15618) Improve datanode shutdown latency

2020-10-09 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211046#comment-17211046
 ] 

Ahmed Hussein edited comment on HDFS-15618 at 10/9/20, 2:23 PM:


I added a configuration key {{dfs.block.scanner.volume.join.timeout.ms}} that 
controls how long the shutdown waits when joining a {{VolumeScanner}} thread 
before timing out. This value is only used within 
{{BlockScanner.removeAllVolumeScanners()}}.
This parameter can be used to switch between "fast mode" and "slow mode".
A small value guarantees that the {{Datanode}} will proceed to shut down 
without waiting for the {{VolumeScanner}} to finish.

The default value is set to 1 minute. I tried setting the default value to 500 
ms, but that broke some test cases that expect specific timing behavior.
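
For illustration, the bounded join looks roughly like the sketch below. This is not the code from the patch: the class and method names are hypothetical, and interrupting each worker before the join is an assumption about how the scanners are asked to stop.

```java
import java.util.List;

// Hypothetical sketch of a bounded join on shutdown; not the BlockScanner code.
final class JoinTimeoutSketch {

  /**
   * Interrupts each worker and waits at most joinTimeoutMs for it to exit.
   * Returns the number of workers that were still alive after the wait.
   */
  static int stopAll(List<Thread> workers, long joinTimeoutMs) {
    int stillAlive = 0;
    for (Thread t : workers) {
      t.interrupt();
      try {
        // Note: Thread.join(0) waits forever, so the timeout should be > 0.
        t.join(joinTimeoutMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break; // the caller itself is being interrupted; stop waiting
      }
      if (t.isAlive()) {
        stillAlive++; // give up on this worker and continue the shutdown
      }
    }
    return stillAlive;
  }
}
```

Because the scanner threads are daemon threads, any worker that is still alive after the timeout does not keep the JVM from exiting.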


was (Author: ahussein):
I added a configuration key {{dfs.block.scanner.volume.join.timeout.ms}} that 
controls the duration the thread times out waiting to join on a 
{{VolumeScanner}} thread. This value is only used within 
{{BlockScanner.removeAllVolumeScanners()}}.
This parameter can be used to switch between "fast mode" vs "slow mode".
A small value guarantees that the {{Datanode}} will proceed to shutdown without 
waiting for the {{VolumeScanner}} to finish.

> Improve datanode shutdown latency
> -
>
> Key: HDFS-15618
> URL: https://issues.apache.org/jira/browse/HDFS-15618
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: HDFS-15618.001.patch
>
>
> Datanode shutdown has a very long latency. A block scanner waits for 5 
> minutes to join on each VolumeScanner thread.
> Since the scanners are daemon threads and do not alter the block content, it 
> is safe to ignore such conditions on shutdown of Datanode.






[jira] [Commented] (HDFS-15618) Improve datanode shutdown latency

2020-10-09 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211046#comment-17211046
 ] 

Ahmed Hussein commented on HDFS-15618:
--

I added a configuration key {{dfs.block.scanner.volume.join.timeout.ms}} that 
controls the duration the thread times out waiting to join on a 
{{VolumeScanner}} thread. This value is only used within 
{{BlockScanner.removeAllVolumeScanners()}}.
This parameter can be used to switch between "fast mode" vs "slow mode".
A small value guarantees that the {{Datanode}} will proceed to shutdown without 
waiting for the {{VolumeScanner}} to finish.

> Improve datanode shutdown latency
> -
>
> Key: HDFS-15618
> URL: https://issues.apache.org/jira/browse/HDFS-15618
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: HDFS-15618.001.patch
>
>
> Datanode shutdown has a very long latency. A block scanner waits for 5 
> minutes to join on each VolumeScanner thread.
> Since the scanners are daemon threads and do not alter the block content, it 
> is safe to ignore such conditions on shutdown of Datanode.






[jira] [Updated] (HDFS-15621) Datanode DirectoryScanner uses excessive memory

2020-10-09 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-15621:
-
Attachment: Screenshot 2020-10-09 at 15.20.56.png

> Datanode DirectoryScanner uses excessive memory
> ---
>
> Key: HDFS-15621
> URL: https://issues.apache.org/jira/browse/HDFS-15621
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: Screenshot 2020-10-09 at 14.11.36.png, Screenshot 
> 2020-10-09 at 15.20.56.png
>
>
> We generally work a rule of 1GB heap on a datanode per 1M blocks. For nodes 
> with a lot of blocks, this can mean a lot of heap.
> We recently captured a heapdump of a DN with about 22M blocks and found only 
> about 1.5GB was occupied by the ReplicaMap. Another 9GB of the heap is taken 
> by the DirectoryScanner ScanInfo objects. Most of this memory was allocated to 
> strings.
> Checking the strings in question, we can see two strings per scanInfo, 
> looking like:
> {code}
> /current/BP-671271071-10.163.205.13-1552020401842/current/finalized/subdir28/subdir17/blk_1180438785
> _106716708.meta
> {code}
> I will update a screen shot from MAT showing this.
> For the first string especially, the part 
> "/current/BP-671271071-10.163.205.13-1552020401842/current/finalized/" will 
> be the same for every block in the block pool as the scanner is only 
> concerned about finalized blocks.
> We can probably also store just the subdir indexes "28" and "17" rather than 
> "subdir28/subdir17" and then construct the path when it is requested via the 
> getter.






[jira] [Created] (HDFS-15621) Datanode DirectoryScanner uses excessive memory

2020-10-09 Thread Stephen O'Donnell (Jira)
Stephen O'Donnell created HDFS-15621:


 Summary: Datanode DirectoryScanner uses excessive memory
 Key: HDFS-15621
 URL: https://issues.apache.org/jira/browse/HDFS-15621
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.4.0
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell
 Attachments: Screenshot 2020-10-09 at 14.11.36.png

We generally work to a rule of 1GB of heap on a datanode per 1M blocks. For 
nodes with a lot of blocks, this can mean a lot of heap.

We recently captured a heapdump of a DN with about 22M blocks and found only 
about 1.5GB was occupied by the ReplicaMap. Another 9GB of the heap is taken by 
the DirectoryScanner ScanInfo objects. Most of this memory was allocated to 
strings.

Checking the strings in question, we can see two strings per scanInfo, looking 
like:

{code}
/current/BP-671271071-10.163.205.13-1552020401842/current/finalized/subdir28/subdir17/blk_1180438785
_106716708.meta
{code}

I will update a screen shot from MAT showing this.

For the first string especially, the part 
"/current/BP-671271071-10.163.205.13-1552020401842/current/finalized/" will be 
the same for every block in the block pool as the scanner is only concerned 
about finalized blocks.

We can probably also store just the subdir indexes "28" and "17" rather than 
"subdir28/subdir17" and then construct the path when it is requested via the 
getter.
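
As a rough illustration of the idea (a sketch only, with a hypothetical class name and fields, not the existing ScanInfo), the per-block object could keep a shared reference to the common prefix plus a few numbers, and build the strings on demand. It assumes the meta file name is the block file name followed by the generation stamp and ".meta", as in the example above:

```java
// Hypothetical compact representation; not the existing ScanInfo class.
final class CompactScanInfoSketch {
  // One String instance shared by every block in the block pool,
  // e.g. ".../current/finalized/".
  private final String finalizedDir;
  private final short subdirA;
  private final short subdirB;
  private final long blockId;
  private final long genStamp;

  CompactScanInfoSketch(String finalizedDir, int subdirA, int subdirB,
      long blockId, long genStamp) {
    this.finalizedDir = finalizedDir;
    this.subdirA = (short) subdirA;
    this.subdirB = (short) subdirB;
    this.blockId = blockId;
    this.genStamp = genStamp;
  }

  // The full path is only materialised when a caller asks for it.
  String getBlockPath() {
    return finalizedDir + "subdir" + subdirA + "/subdir" + subdirB
        + "/blk_" + blockId;
  }

  String getMetaPath() {
    return getBlockPath() + "_" + genStamp + ".meta";
  }
}
```

The trade-off is some extra string construction each time a path is requested, in exchange for not holding two long strings per block for the lifetime of the scan.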






[jira] [Updated] (HDFS-15621) Datanode DirectoryScanner uses excessive memory

2020-10-09 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-15621:
-
Attachment: Screenshot 2020-10-09 at 14.11.36.png

> Datanode DirectoryScanner uses excessive memory
> ---
>
> Key: HDFS-15621
> URL: https://issues.apache.org/jira/browse/HDFS-15621
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: Screenshot 2020-10-09 at 14.11.36.png
>
>
> We generally work a rule of 1GB heap on a datanode per 1M blocks. For nodes 
> with a lot of blocks, this can mean a lot of heap.
> We recently captured a heapdump of a DN with about 22M blocks and found only 
> about 1.5GB was occupied by the ReplicaMap. Another 9GB of the heap is taken 
> by the DirectoryScanner ScanInfo objects. Most of this memory was allocated to 
> strings.
> Checking the strings in question, we can see two strings per scanInfo, 
> looking like:
> {code}
> /current/BP-671271071-10.163.205.13-1552020401842/current/finalized/subdir28/subdir17/blk_1180438785
> _106716708.meta
> {code}
> I will update a screen shot from MAT showing this.
> For the first string especially, the part 
> "/current/BP-671271071-10.163.205.13-1552020401842/current/finalized/" will 
> be the same for every block in the block pool as the scanner is only 
> concerned about finalized blocks.
> We can probably also store just the subdir indexes "28" and "17" rather than 
> "subdir28/subdir17" and then construct the path when it is requested via the 
> getter.






[jira] [Work logged] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?focusedWorklogId=498555=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498555
 ]

ASF GitHub Bot logged work on HDFS-15620:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 14:17
Start Date: 09/Oct/20 14:17
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2375:
URL: https://github.com/apache/hadoop/pull/2375#issuecomment-706138293


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  28m 56s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  29m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 27s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 41s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 23s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 15s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 13s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  13m 59s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 13s |  |  the patch passed  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   8m 21s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 109m 56s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2375/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2375 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient xml findbugs checkstyle |
   | uname | Linux 9ff351c4f1ec 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 518a212cfff |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2375/1/testReport/ |
   | Max. process+thread count | 3144 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 

[jira] [Work logged] (HDFS-15614) Initialize snapshot trash root during NameNode startup if enabled

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15614?focusedWorklogId=498500=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498500
 ]

ASF GitHub Bot logged work on HDFS-15614:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 14:12
Start Date: 09/Oct/20 14:12
Worklog Time Spent: 10m 
  Work Description: smengcl commented on a change in pull request #2370:
URL: https://github.com/apache/hadoop/pull/2370#discussion_r501900056



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##
@@ -8524,6 +8530,39 @@ void checkAccess(String src, FsAction mode) throws 
IOException {
 logAuditEvent(true, operationName, src);
   }
 
+  /**
+   * Check if snapshot roots are created for all existing snapshottable
+   * directories. Create them if not.
+   */
+  void checkAndProvisionSnapshotTrashRoots() throws IOException {
+if (haEnabled) {
+  if (!inActiveState()) {

Review comment:
   I am not 100% sure about this condition check. Any 
suggestions/confirmations?
   
   The goal is to let only the **Active NN** check and provision snapshot trash 
roots. The assumption is that the `mkdirs()` call below propagates the write to 
the standby NameNode.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 498500)
Time Spent: 1.5h  (was: 1h 20m)

> Initialize snapshot trash root during NameNode startup if enabled
> -
>
> Key: HDFS-15614
> URL: https://issues.apache.org/jira/browse/HDFS-15614
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This is a follow-up to HDFS-15607.
> Goal:
> Initialize (create) snapshot trash root for all existing snapshottable 
> directories if {{dfs.namenode.snapshot.trashroot.enabled}} is set to 
> {{true}}. So admins won't have to run {{dfsadmin -provisionTrash}} manually 
> on all those existing snapshottable directories.
> The change is expected to land in {{FSNamesystem}}.
> Discussion:
> 1. Currently in HDFS-15607, the snapshot trash root creation logic is on the 
> client side. But in order for the NN to create it at startup, the logic must 
> also be implemented on the server side, which is also a requirement of 
> WebHDFS (HDFS-15612).
> 2. Alternatively, we can provide an extra parameter to the 
> {{-provisionTrash}} command like: {{dfsadmin -provisionTrash -all}} to 
> initialize/provision trash root on all existing snapshottable dirs.






[jira] [Work logged] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?focusedWorklogId=498418=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498418
 ]

ASF GitHub Bot logged work on HDFS-15620:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 14:05
Start Date: 09/Oct/20 14:05
Worklog Time Spent: 10m 
  Work Description: aajisaka opened a new pull request #2375:
URL: https://github.com/apache/hadoop/pull/2375


   JIRA: https://issues.apache.org/jira/browse/HDFS-15620
   
   * Add assertj to test dependency
   * Skip testSimpleRootListing in TestRouterWebHDFSContractRootDirectory





Issue Time Tracking
---

Worklog Id: (was: 498418)
Time Spent: 0.5h  (was: 20m)

> RBF: Fix test failures after HADOOP-17281
> -
>
> Key: HDFS-15620
> URL: https://issues.apache.org/jira/browse/HDFS-15620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
> test cases. In RBF, the following tests are affected and they are now failing:
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
> * hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure






[jira] [Work logged] (HDFS-15616) [SBN] Disable Observers to trigger edit log roll

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15616?focusedWorklogId=498401=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498401
 ]

ASF GitHub Bot logged work on HDFS-15616:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 14:04
Start Date: 09/Oct/20 14:04
Worklog Time Spent: 10m 
  Work Description: symious opened a new pull request #2373:
URL: https://github.com/apache/hadoop/pull/2373


   Jira ticket: https://issues.apache.org/jira/browse/HDFS-15616





Issue Time Tracking
---

Worklog Id: (was: 498401)
Time Spent: 40m  (was: 0.5h)

> [SBN] Disable Observers to trigger edit log roll
> 
>
> Key: HDFS-15616
> URL: https://issues.apache.org/jira/browse/HDFS-15616
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Janus Chow
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15616.001.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, when an Observer is transitioned from StandbyState, the 
> editLogTailer still sends the request to roll the editLog to the ActiveNN. 
> This should be disabled to keep the definition of "logRollPeriodMs" clear.
> One thing I'm not sure about: in a cluster with multiple standby NameNodes, 
> all the standby NNs will trigger the roll. Should this feature be extended to 
> all standby NNs, or implemented on Observers first?






[jira] [Work logged] (HDFS-15616) [SBN] Disable Observers to trigger edit log roll

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15616?focusedWorklogId=498365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498365
 ]

ASF GitHub Bot logged work on HDFS-15616:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 14:01
Start Date: 09/Oct/20 14:01
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2373:
URL: https://github.com/apache/hadoop/pull/2373#issuecomment-705988371


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  29m 16s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  29m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 46s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 55s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 51s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 58s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  18m 34s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 47s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 113m 56s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2373/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 234m  2s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.TestAclsEndToEnd |
   |   | hadoop.hdfs.TestDecommissionWithStriped |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.TestLeaseRecovery2 |
   |   | hadoop.hdfs.TestErasureCodingPolicyWithSnapshot |
   |   | hadoop.hdfs.tools.TestECAdmin |
   |   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
   |   | hadoop.hdfs.TestErasureCodingExerciseAPIs |
   |   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
   |   | hadoop.hdfs.TestReadStripedFileWithDNFailure |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
   |   | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
   |   | hadoop.hdfs.TestDFSStripedOutputStream |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.web.TestWebHDFS |
   |   | hadoop.hdfs.tools.TestDFSAdminWithHA |
   |   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
   
   
   | Subsystem | 

[jira] [Work logged] (HDFS-15614) Initialize snapshot trash root during NameNode startup if enabled

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15614?focusedWorklogId=498316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498316
 ]

ASF GitHub Bot logged work on HDFS-15614:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:57
Start Date: 09/Oct/20 13:57
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on a change in pull request #2370:
URL: https://github.com/apache/hadoop/pull/2370#discussion_r502332901



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##
@@ -2031,6 +2033,10 @@ private String metaSaveAsString() {
 return sw.toString();
   }
 
+  public boolean getIsSnapshotTrashRootEnabled() {

Review comment:
   Should getIsSnapshotTrashRootEnabled be renamed to isSnapshotTrashRootEnabled?

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
##
@@ -781,6 +781,10 @@ protected void initialize(Configuration conf) throws 
IOException {
   }
 }
 
+if (namesystem.getIsSnapshotTrashRootEnabled()) {

Review comment:
   How about doing this here:
   ```
   @Override
   public void startActiveServices() throws IOException {
     try {
       namesystem.startActiveServices();
       startTrashEmptier(getConf());
     } catch (Throwable t) {
       doImmediateShutdown(t);
     }
   }
   ```
   
   That is, just before starting the trashEmptier thread. We don't need to check for 
active or standby state here, as this method should be called only on the active NameNode.
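
To make the suggested ordering concrete, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: `ActivationOrderSketch` and its recorded step names are hypothetical stand-ins, and the placement of the provisioning call is an assumption based on the comment above, not the merged patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the NameNode activation path; not the real Hadoop classes. */
final class ActivationOrderSketch {
  /** Records the order in which the steps run so the ordering can be inspected. */
  final List<String> steps = new ArrayList<>();
  private final boolean snapshotTrashRootEnabled;

  ActivationOrderSketch(boolean snapshotTrashRootEnabled) {
    this.snapshotTrashRootEnabled = snapshotTrashRootEnabled;
  }

  /** Mirrors the shape of startActiveServices() quoted in the review comment. */
  void startActiveServices() {
    try {
      steps.add("namesystem.startActiveServices");
      if (snapshotTrashRootEnabled) {
        // Assumed placement: provision trash roots just before the emptier starts.
        steps.add("checkAndProvisionSnapshotTrashRoots");
      }
      steps.add("startTrashEmptier");
    } catch (Throwable t) {
      // The real method shuts the NameNode down here; the sketch only records it.
      steps.add("doImmediateShutdown");
    }
  }

  public static void main(String[] args) {
    ActivationOrderSketch nn = new ActivationOrderSketch(true);
    nn.startActiveServices();
    System.out.println(nn.steps);
  }
}
```

Because this path only runs when a NameNode becomes active, no separate active/standby check is needed in the sketch either.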





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 498316)
Time Spent: 1h 20m  (was: 1h 10m)

> Initialize snapshot trash root during NameNode startup if enabled
> -
>
> Key: HDFS-15614
> URL: https://issues.apache.org/jira/browse/HDFS-15614
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This is a follow-up to HDFS-15607.
> Goal:
> Initialize (create) the snapshot trash root for all existing snapshottable 
> directories if {{dfs.namenode.snapshot.trashroot.enabled}} is set to 
> {{true}}, so that admins won't have to run {{dfsadmin -provisionTrash}} 
> manually on all those existing snapshottable directories.
> The change is expected to land in {{FSNamesystem}}.
> Discussion:
> 1. Currently in HDFS-15607, the snapshot trash root creation logic is on the 
> client side. For the NN to create it at startup, the logic must also be 
> implemented on the server side, which is also a requirement of WebHDFS 
> (HDFS-15612).
> 2. Alternatively, we can add an extra parameter to the {{-provisionTrash}} 
> command, such as {{dfsadmin -provisionTrash -all}}, to initialize/provision 
> the trash root on all existing snapshottable dirs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=498313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498313
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:57
Start Date: 09/Oct/20 13:57
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501443141



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java
##
@@ -519,6 +525,20 @@ private Object invokeMethod(
 }
   }
 
+  /**
+   * For Tracking which is the actual client address.
+   * It adds key/value (clientIp/"ip") pair to the caller context.
+   */
+  private void appendClientIpToCallerContext() {
+final CallerContext ctx = CallerContext.getCurrent();
+String origContext = ctx == null ? null : ctx.getContext();
+byte[] origSignature = ctx == null ? null : ctx.getSignature();
+CallerContext.setCurrent(
+new CallerContext.Builder(origContext, clientConfiguration)

Review comment:
   OK, fixed, please review again, thanks!
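
For readers following the javadoc in the hunk above, appending a client IP to an existing caller context can be sketched in plain Java as below. This is an illustration only: `CallerContextSketch`, `appendField`, the `:` key/value delimiter and the `,` field separator are all assumptions made for the example, not the Hadoop `CallerContext` API or the patch itself.

```java
/** Hypothetical helper; it only illustrates the string handling, not Hadoop's CallerContext. */
final class CallerContextSketch {
  private CallerContextSketch() {}

  /**
   * Appends "key:value" to an existing context string.
   * A null or empty original context yields just the new field.
   */
  static String appendField(String origContext, String key, String value, String separator) {
    String field = key + ":" + value;
    if (origContext == null || origContext.isEmpty()) {
      return field;
    }
    return origContext + separator + field;
  }

  public static void main(String[] args) {
    // The client address used here is an example value.
    System.out.println(appendField("jobA", "clientIp", "10.0.0.1", ","));
    System.out.println(appendField(null, "clientIp", "10.0.0.1", ","));
  }
}
```

Handling the null case explicitly matters because the hunk reads the original context as `ctx == null ? null : ctx.getContext()`, so a request may arrive with no caller context at all.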







Issue Time Tracking
---

Worklog Id: (was: 498313)
Time Spent: 4h 40m  (was: 4.5h)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This Jira focuses on the audit log, which should log the real client IP. 
> Locality is left to HDFS-13248.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=498284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498284
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:54
Start Date: 09/Oct/20 13:54
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#issuecomment-705642911


   The filesystem contract test failures are related to HADOOP-17281.





Issue Time Tracking
---

Worklog Id: (was: 498284)
Time Spent: 4.5h  (was: 4h 20m)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This Jira focuses on the audit log, which should log the real client IP. 
> Locality is left to HDFS-13248.






[jira] [Work logged] (HDFS-15614) Initialize snapshot trash root during NameNode startup if enabled

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15614?focusedWorklogId=498264&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498264
 ]

ASF GitHub Bot logged work on HDFS-15614:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:53
Start Date: 09/Oct/20 13:53
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2370:
URL: https://github.com/apache/hadoop/pull/2370#issuecomment-705817726


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 29s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  29m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 17s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 30s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m  1s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   2m 59s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   1m  1s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 50s | [/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2370/1/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 499 unchanged - 0 fixed = 500 total (was 499)  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m 49s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 45s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   2m 59s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 108m 55s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2370/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 192m 20s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
   |   | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor |
   |   | hadoop.hdfs.TestDFSShell |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.TestGetFileChecksum |
   |   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
   |   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
   |   | hadoop.hdfs.web.TestWebHDFS |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2370/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2370 |
   | Optional 

[jira] [Work logged] (HDFS-15610) Reduce datanode upgrade/hardlink thread

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15610?focusedWorklogId=498258&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498258
 ]

ASF GitHub Bot logged work on HDFS-15610:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:52
Start Date: 09/Oct/20 13:52
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #2365:
URL: https://github.com/apache/hadoop/pull/2365#issuecomment-705382799


   The failed tests pass locally.





Issue Time Tracking
---

Worklog Id: (was: 498258)
Time Spent: 1h 10m  (was: 1h)

> Reduce datanode upgrade/hardlink thread
> ---
>
> Key: HDFS-15610
> URL: https://issues.apache.org/jira/browse/HDFS-15610
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 3.1.4
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There is a kernel overhead on datanode upgrade. If a datanode has millions of 
> blocks and 10+ disks, the block-layout migration is very expensive during 
> its hardlink operation. Slowness is observed when running with a large number 
> of hardlink threads (dfs.datanode.block.id.layout.upgrade.threads, default is 
> 12 threads per disk), and the upgrade runs for 2+ hours. 
> I.e. 10*12=120 threads (for 10 disks).
> Small test:
> RHEL7, 32 cores, 20 GB RAM, 8 GB DN heap
> ||dfs.datanode.block.id.layout.upgrade.threads||Blocks||Disks||Time taken||
> |12|3.3 Million|1|2 minutes and 59 seconds|
> |6|3.3 Million|1|2 minutes and 35 seconds|
> |3|3.3 Million|1|2 minutes and 51 seconds|
> Ran the same test twice and the results were roughly 95% consistent (only a 
> few seconds' difference between iterations). Using 6 threads is faster than 
> 12 threads because of the overhead. 
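
The thread arithmetic in the description can be checked with a few lines of Java. This is only a worked example of the multiplication quoted above; the class and method names are hypothetical and nothing here reads a real DataNode configuration.

```java
/** Worked example of the per-disk thread arithmetic; names are hypothetical. */
final class UpgradeThreadMath {
  private UpgradeThreadMath() {}

  /** Total hardlink threads when every disk gets its own pool of threadsPerDisk threads. */
  static int totalHardlinkThreads(int disks, int threadsPerDisk) {
    return Math.multiplyExact(disks, threadsPerDisk);
  }

  public static void main(String[] args) {
    int disks = 10;
    // Per-disk values taken from the small test in the description.
    for (int threadsPerDisk : new int[] {12, 6, 3}) {
      System.out.println(threadsPerDisk + " per disk -> "
          + totalHardlinkThreads(disks, threadsPerDisk) + " threads");
    }
  }
}
```

With 10 disks, the per-disk values 12, 6 and 3 give 120, 60 and 30 threads in total.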






[jira] [Work logged] (HDFS-15610) Reduce datanode upgrade/hardlink thread

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15610?focusedWorklogId=498226&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498226
 ]

ASF GitHub Bot logged work on HDFS-15610:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:50
Start Date: 09/Oct/20 13:50
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 merged pull request #2365:
URL: https://github.com/apache/hadoop/pull/2365


   





Issue Time Tracking
---

Worklog Id: (was: 498226)
Time Spent: 1h  (was: 50m)

> Reduce datanode upgrade/hardlink thread
> ---
>
> Key: HDFS-15610
> URL: https://issues.apache.org/jira/browse/HDFS-15610
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 3.1.4
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> There is a kernel overhead on datanode upgrade. If a datanode has millions of 
> blocks and 10+ disks, the block-layout migration is very expensive during 
> its hardlink operation. Slowness is observed when running with a large number 
> of hardlink threads (dfs.datanode.block.id.layout.upgrade.threads, default is 
> 12 threads per disk), and the upgrade runs for 2+ hours. 
> I.e. 10*12=120 threads (for 10 disks).
> Small test:
> RHEL7, 32 cores, 20 GB RAM, 8 GB DN heap
> ||dfs.datanode.block.id.layout.upgrade.threads||Blocks||Disks||Time taken||
> |12|3.3 Million|1|2 minutes and 59 seconds|
> |6|3.3 Million|1|2 minutes and 35 seconds|
> |3|3.3 Million|1|2 minutes and 51 seconds|
> Ran the same test twice and the results were roughly 95% consistent (only a 
> few seconds' difference between iterations). Using 6 threads is faster than 
> 12 threads because of the overhead. 






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=498088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498088
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:38
Start Date: 09/Oct/20 13:38
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#issuecomment-705414755









Issue Time Tracking
---

Worklog Id: (was: 498088)
Time Spent: 4h 20m  (was: 4h 10m)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This Jira focuses on the audit log, which should log the real client IP. 
> Locality is left to HDFS-13248.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=498034&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498034
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:29
Start Date: 09/Oct/20 13:29
Worklog Time Spent: 10m 
  Work Description: ferhui merged pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363


   





Issue Time Tracking
---

Worklog Id: (was: 498034)
Time Spent: 4h 10m  (was: 4h)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This Jira focuses on the audit log, which should log the real client IP. 
> Locality is left to HDFS-13248.






[jira] [Work logged] (HDFS-15614) Initialize snapshot trash root during NameNode startup if enabled

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15614?focusedWorklogId=498029&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498029
 ]

ASF GitHub Bot logged work on HDFS-15614:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:28
Start Date: 09/Oct/20 13:28
Worklog Time Spent: 10m 
  Work Description: smengcl opened a new pull request #2370:
URL: https://github.com/apache/hadoop/pull/2370


   https://issues.apache.org/jira/browse/HDFS-15614
   
   Added `checkAndProvisionSnapshotTrashRoots` in `FSNamesystem`. It is triggered in 
`NameNode` on startup if `dfs.namenode.snapshot.trashroot.enabled` is set to 
`true`.
   
   UT added in `TestDistributedFileSystem`.
   
   The logic is ready for review.
   I am considering adding another UT for the HA case.
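
As a rough illustration of what a check-and-provision pass could look like, here is a self-contained sketch over plain strings. It is not the code from the pull request: the class name, the `checkAndProvision` signature, the in-memory set standing in for the namespace, and the `.Trash` directory name are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of provisioning trash roots; paths are plain strings, not HDFS inodes. */
final class SnapshotTrashProvisionSketch {
  /** Assumed name of the trash directory under each snapshottable directory. */
  static final String TRASH_DIR = ".Trash";

  private SnapshotTrashProvisionSketch() {}

  /**
   * Creates a trash root under every snapshottable directory that lacks one.
   * Returns the paths that were newly created; does nothing when the feature is disabled.
   */
  static List<String> checkAndProvision(
      boolean enabled, List<String> snapshottableDirs, Set<String> existingPaths) {
    List<String> created = new ArrayList<>();
    if (!enabled) {
      return created;
    }
    for (String dir : snapshottableDirs) {
      String trashRoot = dir.endsWith("/") ? dir + TRASH_DIR : dir + "/" + TRASH_DIR;
      // Set.add returns false when the path already exists, so reruns are harmless.
      if (existingPaths.add(trashRoot)) {
        created.add(trashRoot);
      }
    }
    return created;
  }

  public static void main(String[] args) {
    Set<String> namespace = new HashSet<>(Arrays.asList("/data/.Trash"));
    System.out.println(checkAndProvision(true, Arrays.asList("/data", "/logs"), namespace));
  }
}
```

Making the pass idempotent is the main design point: a NameNode restart, or an HA failover that runs it again, should leave already provisioned directories untouched.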





Issue Time Tracking
---

Worklog Id: (was: 498029)
Time Spent: 1h  (was: 50m)

> Initialize snapshot trash root during NameNode startup if enabled
> -
>
> Key: HDFS-15614
> URL: https://issues.apache.org/jira/browse/HDFS-15614
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This is a follow-up to HDFS-15607.
> Goal:
> Initialize (create) the snapshot trash root for all existing snapshottable 
> directories if {{dfs.namenode.snapshot.trashroot.enabled}} is set to 
> {{true}}, so that admins won't have to run {{dfsadmin -provisionTrash}} 
> manually on all those existing snapshottable directories.
> The change is expected to land in {{FSNamesystem}}.
> Discussion:
> 1. Currently in HDFS-15607, the snapshot trash root creation logic is on the 
> client side. For the NN to create it at startup, the logic must also be 
> implemented on the server side, which is also a requirement of WebHDFS 
> (HDFS-15612).
> 2. Alternatively, we can add an extra parameter to the {{-provisionTrash}} 
> command, such as {{dfsadmin -provisionTrash -all}}, to initialize/provision 
> the trash root on all existing snapshottable dirs.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=498005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498005
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:26
Start Date: 09/Oct/20 13:26
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#issuecomment-705398703


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 30s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   6m 11s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  24m  3s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 44s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m  1s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 42s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 54s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 25s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 39s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 59s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |  18m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  16m 58s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |  16m 58s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 42s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 19s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 16s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 44s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 50s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 38s |  |  hadoop-common in the patch passed.  |
   | -1 :x: |  unit  |   8m 53s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2363/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 57s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 184m 54s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory |
   |   | hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure |
   |   | hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   |   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory |
   |   | hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2363/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2363 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | 

[jira] [Work logged] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?focusedWorklogId=497919&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497919
 ]

ASF GitHub Bot logged work on HDFS-15620:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 11:56
Start Date: 09/Oct/20 11:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2375:
URL: https://github.com/apache/hadoop/pull/2375#issuecomment-706138293


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  28m 56s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  29m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 27s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 41s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 23s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 15s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 13s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  13m 59s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 13s |  |  the patch passed  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   8m 21s |  |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 109m 56s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2375/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2375 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle |
   | uname | Linux 9ff351c4f1ec 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 518a212cfff |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2375/1/testReport/ |
   | Max. process+thread count | 3144 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 

[jira] [Work logged] (HDFS-15614) Initialize snapshot trash root during NameNode startup if enabled

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15614?focusedWorklogId=497874=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497874
 ]

ASF GitHub Bot logged work on HDFS-15614:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 10:35
Start Date: 09/Oct/20 10:35
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on a change in pull request #2370:
URL: https://github.com/apache/hadoop/pull/2370#discussion_r502334643



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
##
@@ -781,6 +781,10 @@ protected void initialize(Configuration conf) throws 
IOException {
   }
 }
 
+if (namesystem.getIsSnapshotTrashRootEnabled()) {

Review comment:
   How about doing this here:
   ```
   @Override
   public void startActiveServices() throws IOException {
     try {
       namesystem.startActiveServices();
       startTrashEmptier(getConf());
     } catch (Throwable t) {
       doImmediateShutdown(t);
     }
   }
   ```
   
   just before starting the trashEmptier thread? We don't need to check for 
Active or standby state here, as these should be called only in the Active NameNode.
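
The ordering suggested in this review comment can be sketched roughly as below. This is only an illustration under assumptions: `ActiveServicesSketch`, `provisionSnapshotTrashRoots` and the recorded step names are hypothetical stand-ins, not the real `NameNode` or `FSNamesystem` API, and the pull request may implement it differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the NameNode; it only records the order of steps. */
public class ActiveServicesSketch {
    private final List<String> steps = new ArrayList<>();
    private final boolean snapshotTrashRootEnabled;

    public ActiveServicesSketch(boolean snapshotTrashRootEnabled) {
        this.snapshotTrashRootEnabled = snapshotTrashRootEnabled;
    }

    /** Runs only on the Active NameNode, so no HA state check is needed here. */
    public void startActiveServices() {
        steps.add("namesystem.startActiveServices");
        if (snapshotTrashRootEnabled) {
            // Assumed hook: create trash roots for existing snapshottable dirs.
            provisionSnapshotTrashRoots();
        }
        // The trash emptier starts last, after the trash roots exist.
        steps.add("startTrashEmptier");
    }

    private void provisionSnapshotTrashRoots() {
        steps.add("provisionSnapshotTrashRoots");
    }

    public List<String> getSteps() {
        return steps;
    }

    public static void main(String[] args) {
        ActiveServicesSketch nn = new ActiveServicesSketch(true);
        nn.startActiveServices();
        System.out.println(nn.getSteps());
    }
}
```

Placing the call inside `startActiveServices()` would mean the provisioning step runs on every transition to Active, not just at process startup, which seems to be the point of the suggestion.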





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 497874)
Time Spent: 50m  (was: 40m)

> Initialize snapshot trash root during NameNode startup if enabled
> -
>
> Key: HDFS-15614
> URL: https://issues.apache.org/jira/browse/HDFS-15614
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This is a follow-up to HDFS-15607.
> Goal:
> Initialize (create) snapshot trash root for all existing snapshottable 
> directories if {{dfs.namenode.snapshot.trashroot.enabled}} is set to 
> {{true}}. So admins won't have to run {{dfsadmin -provisionTrash}} manually 
> on all those existing snapshottable directories.
> The change is expected to land in {{FSNamesystem}}.
> Discussion:
> 1. Currently in HDFS-15607, the snapshot trash root creation logic is on the 
> client side. But in order for NN to create it at startup, the logic must 
> (also) be implemented on the server side as well. -- which is also a 
> requirement by WebHDFS (HDFS-15612).
> 2. Alternatively, we can provide an extra parameter to the 
> {{-provisionTrash}} command like: {{dfsadmin -provisionTrash -all}} to 
> initialize/provision trash root on all existing snapshottable dirs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15614) Initialize snapshot trash root during NameNode startup if enabled

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15614?focusedWorklogId=497873=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497873
 ]

ASF GitHub Bot logged work on HDFS-15614:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 10:35
Start Date: 09/Oct/20 10:35
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on a change in pull request #2370:
URL: https://github.com/apache/hadoop/pull/2370#discussion_r502332901



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##
@@ -2031,6 +2033,10 @@ private String metaSaveAsString() {
 return sw.toString();
   }
 
+  public boolean getIsSnapshotTrashRootEnabled() {

Review comment:
   Rename getIsSnapshotTrashRootEnabled to isSnapshotTrashRootEnabled?

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
##
@@ -781,6 +781,10 @@ protected void initialize(Configuration conf) throws 
IOException {
   }
 }
 
+if (namesystem.getIsSnapshotTrashRootEnabled()) {

Review comment:
   How about doing this here:
   @Override
   public void startActiveServices() throws IOException {
     try {
       namesystem.startActiveServices();
       startTrashEmptier(getConf());
     } catch (Throwable t) {
       doImmediateShutdown(t);
     }
   }
   
   just before starting the trashEmptier thread? We don't need to check for 
Active or standby state here, as these should be called only in the Active NameNode.







Issue Time Tracking
---

Worklog Id: (was: 497873)
Time Spent: 40m  (was: 0.5h)

> Initialize snapshot trash root during NameNode startup if enabled
> -
>
> Key: HDFS-15614
> URL: https://issues.apache.org/jira/browse/HDFS-15614
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This is a follow-up to HDFS-15607.
> Goal:
> Initialize (create) snapshot trash root for all existing snapshottable 
> directories if {{dfs.namenode.snapshot.trashroot.enabled}} is set to 
> {{true}}. So admins won't have to run {{dfsadmin -provisionTrash}} manually 
> on all those existing snapshottable directories.
> The change is expected to land in {{FSNamesystem}}.
> Discussion:
> 1. Currently in HDFS-15607, the snapshot trash root creation logic is on the 
> client side. But in order for NN to create it at startup, the logic must 
> (also) be implemented on the server side as well. -- which is also a 
> requirement by WebHDFS (HDFS-15612).
> 2. Alternatively, we can provide an extra parameter to the 
> {{-provisionTrash}} command like: {{dfsadmin -provisionTrash -all}} to 
> initialize/provision trash root on all existing snapshottable dirs.






[jira] [Assigned] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-15620:


Assignee: Akira Ajisaka

> RBF: Fix test failures after HADOOP-17281
> -
>
> Key: HDFS-15620
> URL: https://issues.apache.org/jira/browse/HDFS-15620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>
> HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
> test cases. In RBF, the following tests are affected and they are now failing:
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
> * hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure






[jira] [Updated] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-15620:
--
Labels: pull-request-available  (was: )

> RBF: Fix test failures after HADOOP-17281
> -
>
> Key: HDFS-15620
> URL: https://issues.apache.org/jira/browse/HDFS-15620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
> test cases. In RBF, the following tests are affected and they are now failing:
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
> * hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure






[jira] [Work logged] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?focusedWorklogId=497860=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497860
 ]

ASF GitHub Bot logged work on HDFS-15620:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 10:05
Start Date: 09/Oct/20 10:05
Worklog Time Spent: 10m 
  Work Description: aajisaka opened a new pull request #2375:
URL: https://github.com/apache/hadoop/pull/2375


   JIRA: https://issues.apache.org/jira/browse/HDFS-15620
   
   * Add assertj to test dependency
   * Skip testSimpleRootListing in TestRouterWebHDFSContractRootDirectory





Issue Time Tracking
---

Worklog Id: (was: 497860)
Remaining Estimate: 0h
Time Spent: 10m

> RBF: Fix test failures after HADOOP-17281
> -
>
> Key: HDFS-15620
> URL: https://issues.apache.org/jira/browse/HDFS-15620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
> test cases. In RBF, the following tests are affected and they are now failing:
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
> * hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure






[jira] [Updated] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-13293:
---
Fix Version/s: (was: 3.4.)
   3.4.0

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Otherwise, the NameNode doesn't know the client's CallerContext.
> This Jira focuses on the audit log, which logs the real client IP. Locality is 
> left to HDFS-13248.
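
As a rough illustration of the idea in this description, the Router could append the client address to the caller-context string it forwards, and the NameNode side could read it back for the audit log. The sketch below uses plain strings only; the `clientIp` key, the `,` separator and the class name are assumptions made for the example, not the format confirmed by the patch.

```java
/** Hypothetical sketch: carrying a client IP inside a caller-context string. */
public final class CallerContextSketch {
    // Assumed field name and separator; the real patch may use different ones.
    static final String CLIENT_IP_KEY = "clientIp";
    static final String SEPARATOR = ",";

    private CallerContextSketch() {
    }

    /** Router side: append the real client IP to whatever context already exists. */
    public static String withClientIp(String originalContext, String clientIp) {
        String field = CLIENT_IP_KEY + ":" + clientIp;
        if (originalContext == null || originalContext.isEmpty()) {
            return field;
        }
        return originalContext + SEPARATOR + field;
    }

    /** NameNode side: recover the client IP for the audit log, or null if absent. */
    public static String extractClientIp(String context) {
        if (context == null) {
            return null;
        }
        String prefix = CLIENT_IP_KEY + ":";
        for (String field : context.split(SEPARATOR)) {
            if (field.startsWith(prefix)) {
                return field.substring(prefix.length());
            }
        }
        return null;
    }
}
```

For example, `withClientIp("jobId_1", "10.0.0.5")` yields `jobId_1,clientIp:10.0.0.5`, from which `extractClientIp` returns `10.0.0.5`.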






[jira] [Commented] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210722#comment-17210722
 ] 

Hui Fei commented on HDFS-13293:


Merged. [~maobaolong] Thanks for the report; [~aajisaka] [~elgoiri] thanks for 
the review!

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Otherwise, the NameNode doesn't know the client's CallerContext.
> This Jira focuses on the audit log, which logs the real client IP. Locality is 
> left to HDFS-13248.






[jira] [Updated] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-13293:
---
Fix Version/s: 3.4.
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.
>
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Otherwise, the NameNode doesn't know the client's CallerContext.
> This Jira focuses on the audit log, which logs the real client IP. Locality is 
> left to HDFS-13248.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=497810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497810
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 08:12
Start Date: 09/Oct/20 08:12
Worklog Time Spent: 10m 
  Work Description: ferhui merged pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363


   





Issue Time Tracking
---

Worklog Id: (was: 497810)
Time Spent: 3h 50m  (was: 3h 40m)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Otherwise, the NameNode doesn't know the client's CallerContext.
> This Jira focuses on the audit log, which logs the real client IP. Locality is 
> left to HDFS-13248.






[jira] [Updated] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15620:
-
Target Version/s: 3.3.1, 3.4.0

> RBF: Fix test failures after HADOOP-17281
> -
>
> Key: HDFS-15620
> URL: https://issues.apache.org/jira/browse/HDFS-15620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Priority: Major
>
> HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
> test cases. In RBF, the following tests are affected and they are now failing:
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
> * hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure






[jira] [Updated] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15620:
-
Parent: HDFS-14603
Issue Type: Sub-task  (was: Bug)

> RBF: Fix test failures after HADOOP-17281
> -
>
> Key: HDFS-15620
> URL: https://issues.apache.org/jira/browse/HDFS-15620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Priority: Major
>
> HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
> test cases. In RBF, the following tests are affected and they are now failing:
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
> * hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
> * hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure






[jira] [Created] (HDFS-15620) RBF: Fix test failures after HADOOP-17281

2020-10-09 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-15620:


 Summary: RBF: Fix test failures after HADOOP-17281
 Key: HDFS-15620
 URL: https://issues.apache.org/jira/browse/HDFS-15620
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Akira Ajisaka


HADOOP-17281 added FileSystem.listStatusIterator API and added its contract 
test cases. In RBF, the following tests are affected and they are now failing:

* hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus
* hadoop.fs.contract.router.TestRouterHDFSContractRootDirectory
* hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatusSecure
* hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory
* hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure







[jira] [Commented] (HDFS-7343) HDFS smart storage management

2020-10-09 Thread Feilong He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210639#comment-17210639
 ] 

Feilong He commented on HDFS-7343:
--

Hi Brahma, we currently have no plan to merge this feature upstream. We 
maintain the project in a separate repo; see https://github.com/Intel-bigdata/SSM 

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
>Priority: Major
> Attachments: HDFS-Smart-Storage-Management-update.pdf, 
> HDFS-Smart-Storage-Management.pdf, 
> HDFSSmartStorageManagement-General-20170315.pdf, 
> HDFSSmartStorageManagement-Phase1-20170315.pdf, access_count_tables.jpg, 
> move.jpg, tables_in_ssm.xlsx
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine that considers file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preference and so on.
> Modified the title for re-purpose.
> We'd extend this effort a bit and aim to work on a comprehensive solution 
> that provides a smart storage management service for convenient, 
> intelligent and effective use of erasure coding or replicas, the HDFS cache 
> facility, HSM offerings, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.






[jira] [Work logged] (HDFS-15616) [SBN] Disable Observers to trigger edit log roll

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15616?focusedWorklogId=497774=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497774
 ]

ASF GitHub Bot logged work on HDFS-15616:
-

Author: ASF GitHub Bot
Created on: 09/Oct/20 06:07
Start Date: 09/Oct/20 06:07
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2373:
URL: https://github.com/apache/hadoop/pull/2373#issuecomment-705988371


   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  29m 16s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  29m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 46s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 55s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 51s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 58s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  18m 34s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 47s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 113m 56s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2373/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 234m  2s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.TestAclsEndToEnd |
   |   | hadoop.hdfs.TestDecommissionWithStriped |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.TestLeaseRecovery2 |
   |   | hadoop.hdfs.TestErasureCodingPolicyWithSnapshot |
   |   | hadoop.hdfs.tools.TestECAdmin |
   |   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
   |   | hadoop.hdfs.TestErasureCodingExerciseAPIs |
   |   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
   |   | hadoop.hdfs.TestReadStripedFileWithDNFailure |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
   |   | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
   |   | hadoop.hdfs.TestDFSStripedOutputStream |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.web.TestWebHDFS |
   |   | hadoop.hdfs.tools.TestDFSAdminWithHA |
   |   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
   
   
   | Subsystem |