[jira] [Commented] (HDFS-15599) RBF: Add API to expose resolved destinations (namespace) in Router

2020-09-25 Thread Fengnan Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202523#comment-17202523
 ] 

Fengnan Li commented on HDFS-15599:
---

Thanks for the pointer [~ayushtkn] [~elgoiri] 

Actually I had thought about HDFS-14249, which provides this ability through the 
Router admin, but not for clients. To solve our problem, a programmatically 
accessible API is needed. Consider the case where Hive wants to rename a table 
to promote it from staging to prod: it needs to check whether the target 
location is in the same cluster as the one holding the staging db.
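
As a minimal client-side sketch (the Router authority and paths below are 
hypothetical), the check could be built on the existing FileSystem#resolvePath 
mentioned in the description - with the caveat that resolving through the 
Router to the underlying namespace is exactly what this JIRA proposes to expose:
{code:java}
import java.net.URI;
import java.util.Objects;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ResolveCheck {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new URI("hdfs://router:8888"),
        new Configuration());
    // Resolve both paths, then compare authorities to decide between an
    // in-cluster rename and a distcp.
    Path src = fs.resolvePath(new Path("/staging/db1/table1"));
    Path dst = fs.resolvePath(new Path("/prod/db1/table1"));
    boolean sameCluster = Objects.equals(src.toUri().getAuthority(),
        dst.toUri().getAuthority());
    System.out.println(sameCluster ? "in-cluster rename" : "needs distcp");
  }
}
{code}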

> RBF: Add API to expose resolved destinations (namespace) in Router
> --
>
> Key: HDFS-15599
> URL: https://issues.apache.org/jira/browse/HDFS-15599
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
>
> We quite often see requests asking where a path in the Router actually 
> points. Two main use cases are:
> 1) Calculating the HDFS capacity usage allocation of all Hive tables that 
> have been onboarded to the Router.
> 2) A failure-prevention method for cross-cluster rename: first check the 
> source and destination HDFS locations, then issue a distcp command if 
> possible to avoid the exception.
> Inside the Router, the function getLocationsForPath does the work, but it is 
> internal only and not visible to clients.
> RouterAdmin has getMountTableEntries, but that is a dump of the mount table 
> without any resolving.
>  
> We are proposing to add such an API, and there are two ways:
> 1) Adding this API in RouterRpcServer, which requires a change in 
> ClientNameNodeProtocol to include the new API. 
> 2) Adding this API in RouterAdminServer, which requires a new protocol 
> between the client and the admin server.
>  
> There is one existing resolvePath in FileSystem which can be used to 
> implement this call from the client side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15598) ViewHDFS#canonicalizeUri should not be restricted to DFS only API.

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15598?focusedWorklogId=491521=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491521
 ]

ASF GitHub Bot logged work on HDFS-15598:
-

Author: ASF GitHub Bot
Created on: 26/Sep/20 04:22
Start Date: 26/Sep/20 04:22
Worklog Time Spent: 10m 
  Work Description: umamaheswararao commented on pull request #2339:
URL: https://github.com/apache/hadoop/pull/2339#issuecomment-699373992


   Thank you @jojochuang for the review! I have just committed it to trunk.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491521)
Time Spent: 0.5h  (was: 20m)

> ViewHDFS#canonicalizeUri should not be restricted to DFS only API.
> --
>
> Key: HDFS-15598
> URL: https://issues.apache.org/jira/browse/HDFS-15598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As part of Hive partitions verification, an insert failed because 
> canonicalizeUri is restricted to DFS only. This can be relaxed by delegating 
> to vfs#canonicalizeUri.
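
A rough sketch of the delegation described (a fragment for illustration only; 
the field name and method visibilities in the actual ViewHDFS classes may 
differ):
{code:java}
@Override
protected URI canonicalizeUri(URI uri) {
  // Prefer the mounted view's canonicalization when a view is present;
  // otherwise keep the plain DFS behaviour.
  return (vfs != null) ? vfs.canonicalizeUri(uri) : super.canonicalizeUri(uri);
}
{code}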



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15598) ViewHDFS#canonicalizeUri should not be restricted to DFS only API.

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15598?focusedWorklogId=491520=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491520
 ]

ASF GitHub Bot logged work on HDFS-15598:
-

Author: ASF GitHub Bot
Created on: 26/Sep/20 04:21
Start Date: 26/Sep/20 04:21
Worklog Time Spent: 10m 
  Work Description: umamaheswararao merged pull request #2339:
URL: https://github.com/apache/hadoop/pull/2339


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491520)
Time Spent: 20m  (was: 10m)

> ViewHDFS#canonicalizeUri should not be restricted to DFS only API.
> --
>
> Key: HDFS-15598
> URL: https://issues.apache.org/jira/browse/HDFS-15598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As part of Hive partitions verification, an insert failed because 
> canonicalizeUri is restricted to DFS only. This can be relaxed by delegating 
> to vfs#canonicalizeUri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15600) TestRouterQuota fails in trunk

2020-09-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202477#comment-17202477
 ] 

Mingliang Liu commented on HDFS-15600:
--

As discussed in [this 
comment|https://issues.apache.org/jira/browse/HDFS-15025?focusedCommentId=17202476=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17202476], 
I'm +1 on the idea to fix. Thanks.

> TestRouterQuota fails in trunk
> --
>
> Key: HDFS-15600
> URL: https://issues.apache.org/jira/browse/HDFS-15600
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: Ayush Saxena
>Priority: Major
>
> The test is failing due to the addition of a new storage type {{NVDIMM}} in 
> the middle of the enum.
> Ref :
> https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/204/testReport/org.apache.hadoop.hdfs.server.federation.router/TestRouterQuota/testStorageTypeQuota/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-09-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202476#comment-17202476
 ] 

Mingliang Liu edited comment on HDFS-15025 at 9/26/20, 1:07 AM:


[~ayushtkn] This is a good question.

First, I did not see code that depends on the ordinal of the enums, given that 
users configure disks with the storage name and set the storage policy for 
directories. The existing disk type names and storage policies kept their names 
and ordinals. So I was not thinking this was an "incompatible" change - even 
though it adds a new field ({{isRam}}) to this class. Meanwhile, the code 
comment says the types are sorted by speed, not by fixed ordinal.
{code}
@InterfaceStability.Unstable
public enum StorageType {
  // sorted by the speed of the storage types, from fast to slow
{code}

Second, I was not aware of the test failure in multiple previous QA runs of the 
patch (in the pull request). I did not check the last QA run, but would be glad 
if we can find out why this was not reported in PreCommit runs. [~wangyayun] 
Would you have a look at the test failure and/or why it was not reported 
previously? I glimpsed at it and now think the test fails just because it makes 
assumptions about the ordinal - which made sense, as the {{quota}} array 
previously covered the complete set of types. I think your proposal works great 
and fixes the test, so +1 on the idea to fix.
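
For illustration (hypothetical values, not actual Hadoop code), this is the 
kind of positional assumption that breaks when a constant is inserted mid-enum:
{code:java}
// An expected-quota array written against the old enum order:
//   RAM_DISK=0, SSD=1, DISK=2, ARCHIVE=3, PROVIDED=4
long[] quota = {-1L, 100L, 200L, 300L, -1L};
// After NVDIMM is inserted at position 1, SSD.ordinal() becomes 2, so this
// lookup silently returns the value that was meant for DISK (200, not 100).
long ssdQuota = quota[StorageType.SSD.ordinal()];
{code}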

CC: [~brahmareddy]


was (Author: liuml07):
[~ayushtkn] This is a good question.

First, I did not see code that depends on the ordinal of the enums, given that 
users configure disks with the storage name and set the storage policy for 
directories. The existing disk type names and storage policies kept their names 
and ordinals. So I was not thinking this was an "incompatible" change - even 
though it adds a new field ({{isRam}}) to this class. Meanwhile, the code 
comment says the types are sorted by speed, not by fixed ordinal.
{code}
@InterfaceStability.Unstable
public enum StorageType {
  // sorted by the speed of the storage types, from fast to slow
{code}

Second, I was not aware of the test failure in multiple previous QA runs of the 
patch (in the pull request). I did not check the last QA run, but would be glad 
if we can find out why this was not reported in PreCommit runs. [~wangyayun] 
Would you have a look? I glimpsed at it and think the test fails just because 
it makes assumptions about the ordinal. I think your proposal works just fine 
and fixes the test, so +1 on the idea.

CC: [~brahmareddy]

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: YaYun Wang
>Assignee: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, 
> HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, 
> HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> The non-volatile memory NVDIMM is faster than SSD and can be used 
> simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on 
> NVDIMM not only improves the response rate of HDFS but also ensures the 
> reliability of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-09-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202476#comment-17202476
 ] 

Mingliang Liu commented on HDFS-15025:
--

[~ayushtkn] This is a good question.

First, I did not see code that depends on the ordinal of the enums, given that 
users configure disks with the storage name and set the storage policy for 
directories. The existing disk type names and storage policies kept their names 
and ordinals. So I was not thinking this was an "incompatible" change - even 
though it adds a new field ({{isRam}}) to this class. Meanwhile, the code 
comment says the types are sorted by speed, not by fixed ordinal.
{code}
@InterfaceStability.Unstable
public enum StorageType {
  // sorted by the speed of the storage types, from fast to slow
{code}

Second, I was not aware of the test failure in multiple previous QA runs of the 
patch (in the pull request). I did not check the last QA run, but would be glad 
if we can find out why this was not reported in PreCommit runs. [~wangyayun] 
Would you have a look? I glimpsed at it and think the test fails just because 
it makes assumptions about the ordinal. I think your proposal works just fine 
and fixes the test, so +1 on the idea.

CC: [~brahmareddy]

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: YaYun Wang
>Assignee: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, 
> HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, 
> HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> The non-volatile memory NVDIMM is faster than SSD and can be used 
> simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on 
> NVDIMM not only improves the response rate of HDFS but also ensures the 
> reliability of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15598) ViewHDFS#canonicalizeUri should not be restricted to DFS only API.

2020-09-25 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202454#comment-17202454
 ] 

Hadoop QA commented on HDFS-15598:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
17s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to 
include any new or modified tests. Please justify why no new tests are needed 
for this patch. Also please list what manual steps were performed to verify 
this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
33s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 
39s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 
46s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
33s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 5s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
39s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 20s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m 
48s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
18s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
51s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 
29s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 23m 
29s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
42s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 
42s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 2s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
33s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | 

[jira] [Created] (HDFS-15601) Batch listing: gracefully fallback to use non-batched listing when NameNode doesn't support the feature

2020-09-25 Thread Chao Sun (Jira)
Chao Sun created HDFS-15601:
---

 Summary: Batch listing: gracefully fallback to use non-batched 
listing when NameNode doesn't support the feature
 Key: HDFS-15601
 URL: https://issues.apache.org/jira/browse/HDFS-15601
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs
Reporter: Chao Sun


HDFS-13616 requires both server- and client-side changes. However, it is common 
for users to use a newer client to talk to an older HDFS (say 2.10). Currently 
the client will simply fail in this scenario. A better approach, perhaps, is to 
have the client fall back to non-batched listing on the input directories.
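
A hedged sketch of that fallback (the method and exception names here are 
assumptions, not the final API):
{code:java}
try {
  // Batched path: one RPC lists a batch of directories at once.
  RemoteIterator<PartialListing<FileStatus>> it =
      dfs.batchedListStatusIterator(paths);
  process(it);
} catch (UnsupportedOperationException | RpcNoSuchMethodException e) {
  // Older NameNode (e.g. 2.10) has no batched RPC: degrade gracefully to
  // one listStatusIterator() call per input directory.
  for (Path p : paths) {
    process(dfs.listStatusIterator(p));
  }
}
{code}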



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15598) ViewHDFS#canonicalizeUri should not be restricted to DFS only API.

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15598?focusedWorklogId=491387=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491387
 ]

ASF GitHub Bot logged work on HDFS-15598:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 19:44
Start Date: 25/Sep/20 19:44
Worklog Time Spent: 10m 
  Work Description: umamaheswararao opened a new pull request #2339:
URL: https://github.com/apache/hadoop/pull/2339


   https://issues.apache.org/jira/browse/HDFS-15598



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491387)
Remaining Estimate: 0h
Time Spent: 10m

> ViewHDFS#canonicalizeUri should not be restricted to DFS only API.
> --
>
> Key: HDFS-15598
> URL: https://issues.apache.org/jira/browse/HDFS-15598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As part of Hive partitions verification, an insert failed because 
> canonicalizeUri is restricted to DFS only. This can be relaxed by delegating 
> to vfs#canonicalizeUri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15598) ViewHDFS#canonicalizeUri should not be restricted to DFS only API.

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-15598:
--
Labels: pull-request-available  (was: )

> ViewHDFS#canonicalizeUri should not be restricted to DFS only API.
> --
>
> Key: HDFS-15598
> URL: https://issues.apache.org/jira/browse/HDFS-15598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As part of Hive partitions verification, an insert failed because 
> canonicalizeUri is restricted to DFS only. This can be relaxed by delegating 
> to vfs#canonicalizeUri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15600) TestRouterQuota fails in trunk

2020-09-25 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202368#comment-17202368
 ] 

Ayush Saxena commented on HDFS-15600:
-

A simple fix:
Just add one extra -1 at the start of the array in 
{{verifyTypeQuotaAndConsume}}, and an extra 0 at the start of the second 
argument, for the last two occurrences of {{verifyTypeQuotaAndConsume}}.

If anybody wants to take this up, feel free to take it ahead.
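
A rough sketch of that fix (the array values here are hypothetical, shown only 
to illustrate the shape of the change):
{code:java}
// The expected-value arrays are positional per StorageType, so after NVDIMM
// was added they need one extra leading slot: -1 (no quota) and 0 (nothing
// consumed yet).
verifyTypeQuotaAndConsume(
    new long[] {-1, -1, ssQuota, dsQuota, -1, -1},   // extra -1 at the start
    new long[] {0, 0, ssConsumed, dsConsumed, 0, 0}, // extra 0 at the start
    quotaUsage);
{code}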

> TestRouterQuota fails in trunk
> --
>
> Key: HDFS-15600
> URL: https://issues.apache.org/jira/browse/HDFS-15600
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: Ayush Saxena
>Priority: Major
>
> The test is failing due to the addition of a new storage type {{NVDIMM}} in 
> the middle of the enum.
> Ref :
> https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/204/testReport/org.apache.hadoop.hdfs.server.federation.router/TestRouterQuota/testStorageTypeQuota/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-09-25 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202364#comment-17202364
 ] 

Ayush Saxena commented on HDFS-15025:
-

This breaks {{TestRouterQuota#testStorageTypeQuota}}
https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/204/testReport/org.apache.hadoop.hdfs.server.federation.router/TestRouterQuota/testStorageTypeQuota/

{code:java}
-  RAM_DISK(true),
-  SSD(false),
-  DISK(false),
-  ARCHIVE(false),
-  PROVIDED(false);
+  RAM_DISK(true, true),
+  NVDIMM(false, true),
{code}

I didn't check the code closely, but I have a question: why is {{NVDIMM}} added 
in between rather than at the end?
This changes the {{ordinal}} of the enums and can potentially break other code 
as well. If there is no very specific reason, please consider moving it to the 
end. I couldn't spare the time to check whether putting it in between is an 
incompatible change or not.

Anyway, I have created HDFS-15600 for the test. I will leave the decision on 
the position of the storage type to the folks involved here.

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: YaYun Wang
>Assignee: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, 
> HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, 
> HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> The non-volatile memory NVDIMM is faster than SSD and can be used 
> simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on 
> NVDIMM not only improves the response rate of HDFS but also ensures the 
> reliability of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15600) TestRouterQuota fails in trunk

2020-09-25 Thread Ayush Saxena (Jira)
Ayush Saxena created HDFS-15600:
---

 Summary: TestRouterQuota fails in trunk
 Key: HDFS-15600
 URL: https://issues.apache.org/jira/browse/HDFS-15600
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: rbf
Reporter: Ayush Saxena


The test is failing due to the addition of a new storage type {{NVDIMM}} in the 
middle of the enum.
Ref :
https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/204/testReport/org.apache.hadoop.hdfs.server.federation.router/TestRouterQuota/testStorageTypeQuota/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15599) RBF: Add API to expose resolved destinations (namespace) in Router

2020-09-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202295#comment-17202295
 ] 

Íñigo Goiri commented on HDFS-15599:


Thanks [~ayushtkn] for reminding me of HDFS-14249; yes, that's actually a good 
base.
[~fengnanli], is there anything missing from that API?
We can polish it if there is some use case that is not covered.
In any case, I would use this JIRA to expose this through the Router Web UI or 
so.

> RBF: Add API to expose resolved destinations (namespace) in Router
> --
>
> Key: HDFS-15599
> URL: https://issues.apache.org/jira/browse/HDFS-15599
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
>
> We quite often see requests asking where a path in the Router actually 
> points. Two main use cases are:
> 1) Calculating the HDFS capacity usage allocation of all Hive tables that 
> have been onboarded to the Router.
> 2) A failure-prevention method for cross-cluster rename: first check the 
> source and destination HDFS locations, then issue a distcp command if 
> possible to avoid the exception.
> Inside the Router, the function getLocationsForPath does the work, but it is 
> internal only and not visible to clients.
> RouterAdmin has getMountTableEntries, but that is a dump of the mount table 
> without any resolving.
>  
> We are proposing to add such an API, and there are two ways:
> 1) Adding this API in RouterRpcServer, which requires a change in 
> ClientNameNodeProtocol to include the new API. 
> 2) Adding this API in RouterAdminServer, which requires a new protocol 
> between the client and the admin server.
>  
> There is one existing resolvePath in FileSystem which can be used to 
> implement this call from the client side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15594) Lazy calculate live datanodes in safe mode tip

2020-09-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202286#comment-17202286
 ] 

Íñigo Goiri commented on HDFS-15594:


Thanks [~NickyYe] for the patch and thanks [~hexiaoqiao] and [~ayushtkn] for 
the review.
Merged the PR.

> Lazy calculate live datanodes in safe mode tip
> --
>
> Key: HDFS-15594
> URL: https://issues.apache.org/jira/browse/HDFS-15594
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Safe mode tip is printed every 20 seconds.
> This change defers calculating live datanodes until the reported block 
> threshold is met.
>  Old 
> {code:java}
> STATE* Safe mode ON. The reported blocks 111054015 needs additional 27902753 
> blocks to reach the threshold 0.9990 of total blocks 139095856. The number of 
> live datanodes 2531 has reached the minimum number 1. Safe mode will be 
> turned off automatically once the thresholds have been reached.{code}
> New 
> {code:java}
> STATE* Safe mode ON. 
> The reported blocks 134851250 needs additional 3218494 blocks to reach the 
> threshold 0.9990 of total blocks 138207947.
> The number of live datanodes is not calculated since reported blocks hasn't 
> reached the threshold. Safe mode will be turned off automatically once the 
> thresholds have been reached.{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15594) Lazy calculate live datanodes in safe mode tip

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15594?focusedWorklogId=491316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491316
 ]

ASF GitHub Bot logged work on HDFS-15594:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 16:48
Start Date: 25/Sep/20 16:48
Worklog Time Spent: 10m 
  Work Description: goiri merged pull request #2332:
URL: https://github.com/apache/hadoop/pull/2332


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491316)
Time Spent: 1.5h  (was: 1h 20m)

> Lazy calculate live datanodes in safe mode tip
> --
>
> Key: HDFS-15594
> URL: https://issues.apache.org/jira/browse/HDFS-15594
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Safe mode tip is printed every 20 seconds.
> This change defers calculating live datanodes until the reported block 
> threshold is met.
>  Old 
> {code:java}
> STATE* Safe mode ON. The reported blocks 111054015 needs additional 27902753 
> blocks to reach the threshold 0.9990 of total blocks 139095856. The number of 
> live datanodes 2531 has reached the minimum number 1. Safe mode will be 
> turned off automatically once the thresholds have been reached.{code}
> New 
> {code:java}
> STATE* Safe mode ON. 
> The reported blocks 134851250 needs additional 3218494 blocks to reach the 
> threshold 0.9990 of total blocks 138207947.
> The number of live datanodes is not calculated since reported blocks hasn't 
> reached the threshold. Safe mode will be turned off automatically once the 
> thresholds have been reached.{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15594) Lazy calculate live datanodes in safe mode tip

2020-09-25 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15594:
---
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Lazy calculate live datanodes in safe mode tip
> --
>
> Key: HDFS-15594
> URL: https://issues.apache.org/jira/browse/HDFS-15594
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Safe mode tip is printed every 20 seconds.
> This change defers calculating live datanodes until the reported block 
> threshold is met.
>  Old 
> {code:java}
> STATE* Safe mode ON. The reported blocks 111054015 needs additional 27902753 
> blocks to reach the threshold 0.9990 of total blocks 139095856. The number of 
> live datanodes 2531 has reached the minimum number 1. Safe mode will be 
> turned off automatically once the thresholds have been reached.{code}
> New 
> {code:java}
> STATE* Safe mode ON. 
> The reported blocks 134851250 needs additional 3218494 blocks to reach the 
> threshold 0.9990 of total blocks 138207947.
> The number of live datanodes is not calculated since reported blocks hasn't 
> reached the threshold. Safe mode will be turned off automatically once the 
> thresholds have been reached.{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15594) Lazy calculate live datanodes in safe mode tip

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15594?focusedWorklogId=491315=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491315
 ]

ASF GitHub Bot logged work on HDFS-15594:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 16:47
Start Date: 25/Sep/20 16:47
Worklog Time Spent: 10m 
  Work Description: goiri commented on pull request #2332:
URL: https://github.com/apache/hadoop/pull/2332#issuecomment-699035579


   There were a lot of failures because of TestFileChecksum:
   
https://ci-hadoop.apache.org/job/hadoop-multibranch/view/change-requests/job/PR-2332/4/testReport/
   But they are not related.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491315)
Time Spent: 1h 20m  (was: 1h 10m)

> Lazy calculate live datanodes in safe mode tip
> --
>
> Key: HDFS-15594
> URL: https://issues.apache.org/jira/browse/HDFS-15594
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Safe mode tip is printed every 20 seconds.
> This change defers calculating live datanodes until the reported block 
> threshold is met.
>  Old 
> {code:java}
> STATE* Safe mode ON. The reported blocks 111054015 needs additional 27902753 
> blocks to reach the threshold 0.9990 of total blocks 139095856. The number of 
> live datanodes 2531 has reached the minimum number 1. Safe mode will be 
> turned off automatically once the thresholds have been reached.{code}
> New 
> {code:java}
> STATE* Safe mode ON. 
> The reported blocks 134851250 needs additional 3218494 blocks to reach the 
> threshold 0.9990 of total blocks 138207947.
> The number of live datanodes is not calculated since reported blocks hasn't 
> reached the threshold. Safe mode will be turned off automatically once the 
> thresholds have been reached.{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15548) Allow configuring DISK/ARCHIVE storage types on same device mount

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15548?focusedWorklogId=491037=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491037
 ]

ASF GitHub Bot logged work on HDFS-15548:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:37
Start Date: 25/Sep/20 13:37
Worklog Time Spent: 10m 
  Work Description: LeonGao91 commented on pull request #2288:
URL: https://github.com/apache/hadoop/pull/2288#issuecomment-698620014


   @Hexiaoqiao Thanks for the comments! I have replied and please let me know 
if it makes sense to you.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491037)
Time Spent: 5h 20m  (was: 5h 10m)

> Allow configuring DISK/ARCHIVE storage types on same device mount
> -
>
> Key: HDFS-15548
> URL: https://issues.apache.org/jira/browse/HDFS-15548
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> We can allow configuring DISK/ARCHIVE storage types on the same device mount 
> using two separate directories.
> Users should be able to configure the capacity for each. Also, the datanode 
> usage report should report stats correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=491118=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491118
 ]

ASF GitHub Bot logged work on HDFS-15025:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:43
Start Date: 25/Sep/20 13:43
Worklog Time Spent: 10m 
  Work Description: brahmareddybattula merged pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491118)
Time Spent: 12h 10m  (was: 12h)

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: YaYun Wang
>Assignee: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, 
> HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, 
> HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> The non-volatile memory NVDIMM is faster than SSD and can be used 
> simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on 
> NVDIMM not only improves the response rate of HDFS but also ensures the 
> reliability of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15594) Lazy calculate live datanodes in safe mode tip

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15594?focusedWorklogId=491103=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491103
 ]

ASF GitHub Bot logged work on HDFS-15594:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:42
Start Date: 25/Sep/20 13:42
Worklog Time Spent: 10m 
  Work Description: NickyYe commented on a change in pull request #2332:
URL: https://github.com/apache/hadoop/pull/2332#discussion_r494624826



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerSafeMode.java
##
@@ -309,16 +311,21 @@ String getSafeModeTip() {
 }
 
 if (datanodeThreshold > 0) {
-  int numLive = blockManager.getDatanodeManager().getNumLiveDataNodes();
-  if (numLive < datanodeThreshold) {
-msg += String.format(
-"The number of live datanodes %d needs an additional %d live "
-+ "datanodes to reach the minimum number %d.%n",
-numLive, (datanodeThreshold - numLive), datanodeThreshold);
+  if (isBlockThresholdMet) {
+int numLive = blockManager.getDatanodeManager().getNumLiveDataNodes();
+if (numLive < datanodeThreshold) {
+  msg += String.format(
+  "The number of live datanodes %d needs an additional %d live "
+  + "datanodes to reach the minimum number %d.%n",
+  numLive, (datanodeThreshold - numLive), datanodeThreshold);
+} else {
+  msg += String.format("The number of live datanodes %d has reached "
+  + "the minimum number %d. ",
+  numLive, datanodeThreshold);
+}
   } else {
-msg += String.format("The number of live datanodes %d has reached "
-+ "the minimum number %d. ",
-numLive, datanodeThreshold);
+msg += "The number of live datanodes is not calculated " +

Review comment:
   fixed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491103)
Time Spent: 1h 10m  (was: 1h)

> Lazy calculate live datanodes in safe mode tip
> --
>
> Key: HDFS-15594
> URL: https://issues.apache.org/jira/browse/HDFS-15594
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Safe mode tip is printed every 20 seconds.
> This change defers calculating live datanodes until the reported block 
> threshold is met.
>  Old 
> {code:java}
> STATE* Safe mode ON. The reported blocks 111054015 needs additional 27902753 
> blocks to reach the threshold 0.9990 of total blocks 139095856. The number of 
> live datanodes 2531 has reached the minimum number 1. Safe mode will be 
> turned off automatically once the thresholds have been reached.{code}
> New 
> {code:java}
> STATE* Safe mode ON. 
> The reported blocks 134851250 needs additional 3218494 blocks to reach the 
> threshold 0.9990 of total blocks 138207947.
> The number of live datanodes is not calculated since reported blocks hasn't 
> reached the threshold. Safe mode will be turned off automatically once the 
> thresholds have been reached.{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15590) namenode fails to start when ordered snapshot deletion feature is disabled

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15590?focusedWorklogId=491000=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491000
 ]

ASF GitHub Bot logged work on HDFS-15590:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:33
Start Date: 25/Sep/20 13:33
Worklog Time Spent: 10m 
  Work Description: bshashikant merged pull request #2326:
URL: https://github.com/apache/hadoop/pull/2326


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491000)
Time Spent: 1h 40m  (was: 1.5h)

> namenode fails to start when ordered snapshot deletion feature is disabled
> --
>
> Key: HDFS-15590
> URL: https://issues.apache.org/jira/browse/HDFS-15590
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> {code:java}
> 1. Enabled ordered deletion snapshot feature.
> 2. Created snapshottable directory - /user/hrt_6/atrr_dir1
> 3. Created snapshots s0, s1, s2.
> 4. Deleted snapshot s2
> 5. Delete snapshot s0, s1, s2 again
> 6. Disable ordered deletion snapshot feature
> 7. Restart Namenode
> Failed to start namenode.
> org.apache.hadoop.hdfs.protocol.SnapshotException: Cannot delete snapshot s2 
> from path /user/hrt_6/atrr_dir2: the snapshot does not exist.
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:293)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:510)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:819)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:287)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:182)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:912)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:760)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:337)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:755)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:646)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:717)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:960)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:933)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1670)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1737)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15548) Allow configuring DISK/ARCHIVE storage types on same device mount

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15548?focusedWorklogId=491089=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491089
 ]

ASF GitHub Bot logged work on HDFS-15548:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:41
Start Date: 25/Sep/20 13:41
Worklog Time Spent: 10m 
  Work Description: LeonGao91 commented on a change in pull request #2288:
URL: https://github.com/apache/hadoop/pull/2288#discussion_r494599382



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##
@@ -190,6 +193,26 @@
 }
 this.conf = conf;
 this.fileIoProvider = fileIoProvider;
+this.enableSameDiskArchival =
+conf.getBoolean(DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING,
+DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING_DEFAULT);
+if (enableSameDiskArchival) {
+  this.mount = usage.getMount();
+  reservedForArchive = conf.getDouble(

Review comment:
   Yeah, it's a good point. The reason I put it this way is to make the 
configuration less verbose for the normal use case where a datanode has only 
one type of disk. Otherwise, users would need to tag all the disks, which is 
less readable and easy to get wrong.
   
   I think we can later introduce an additional config for the use case you 
mentioned, listing out each volume and its target ratio.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##
@@ -412,16 +435,28 @@ long getBlockPoolUsed(String bpid) throws IOException {
*/
   @VisibleForTesting
   public long getCapacity() {
+long capacity;
 if (configuredCapacity < 0L) {
   long remaining;
   if (cachedCapacity > 0L) {
 remaining = cachedCapacity - getReserved();
   } else {
 remaining = usage.getCapacity() - getReserved();
   }
-  return Math.max(remaining, 0L);
+  capacity = Math.max(remaining, 0L);
+} else {
+  capacity = configuredCapacity;
+}
+
+if (enableSameDiskArchival) {

Review comment:
   This is actually the important part to enable this feature, to allow 
users to configure the capacity of a fsVolume.
   
   As we are configuring two fsVolumes on the same underlying filesystem, if we 
do nothing the capacity will be counted twice, and all the reported stats will 
be incorrect.
   
   Here is an example:
   Let's say we want to configure `[DISK]/data01/dfs` and 
`[ARCHIVE]/data01/dfs_archive` on a 4TB disk mount `/data01`, and we want to 
assign 1 TB to `[DISK]/data01/dfs` and 3 TB for `[ARCHIVE]/data01/dfs_archive`, 
we can make `reservedForArchive` to be 0.75 and put those two dirs in the 
volume list.
   
   In this case, `/data01/dfs` will be reported as a 1TB volume and 
`/data01/dfs_archive` will be reported as 3TB volume to HDFS. Logically, HDFS 
will just treat them as two separate volumes.
   
   If we don't make the change here, HDFS will see two volumes and each of them 
is 4TB, in that case, the 4TB disk will be counted as 4 * 2 = 8TB capacity in 
namenode and all the related stats will be wrong.
   
   Another change we need to make is `getActualNonDfsUsed()`, as below. Let's 
say in the above 4TB disk setup we use 0.1TB as reserved, and 
`[ARCHIVE]/data01/dfs_archive` already has 2TB used. In this case, when we 
calculate `getActualNonDfsUsed()` for `[DISK]/data01/dfs`, it will always 
return 0, which is not correct and will cause other weird issues. As the two 
fsVolumes are on the same filesystem, the reserved space should be shared.
   
   According to our analysis and cluster testing results, updating these two 
functions `getCapacity()` and `getActualNonDfsUsed()` is enough to keep the 
stats correct for the two "logical" fsVolumes on the same disk.
   
   I can update the java doc to reflect this when the feature is turned on.
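
   As a worked sketch of the arithmetic above (a standalone illustration, not 
the actual FsVolumeImpl code):
{code:java}
// 4 TB mount shared by [DISK]/data01/dfs and [ARCHIVE]/data01/dfs_archive.
long mountCapacity = 4L * 1024 * 1024 * 1024 * 1024;
double reservedForArchive = 0.75;
// Each logical volume reports only its share of the mount...
long archiveCapacity = (long) (mountCapacity * reservedForArchive); // 3 TB
long diskCapacity = mountCapacity - archiveCapacity;                // 1 TB
// ...so the NameNode sees 1 TB + 3 TB = 4 TB instead of 4 TB * 2 = 8 TB.
{code}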

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##
@@ -452,7 +487,33 @@ public long getAvailable() throws IOException {
   }
 
   long getActualNonDfsUsed() throws IOException {
-return usage.getUsed() - getDfsUsed();
+// DISK and ARCHIVAL on same disk

Review comment:
   Commented with an example use case above - hopefully it explains it well 
: )

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
##
@@ -62,9 +64,14 @@
   private final VolumeChoosingPolicy blockChooser;
   private final BlockScanner blockScanner;
 
+  private boolean enableSameDiskTiering;

Review comment:
   Good catch! I will update them to be the same name.






[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=491061=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491061
 ]

ASF GitHub Bot logged work on HDFS-15025:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:39
Start Date: 25/Sep/20 13:39
Worklog Time Spent: 10m 
  Work Description: brahmareddybattula commented on a change in pull 
request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r494037751



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java
##
@@ -34,28 +34,35 @@
 @InterfaceStability.Unstable
 public enum StorageType {
   // sorted by the speed of the storage types, from fast to slow
-  RAM_DISK(true),
-  SSD(false),
-  DISK(false),
-  ARCHIVE(false),
-  PROVIDED(false);
+  RAM_DISK(true, true),
+  NVDIMM(false, true),
+  SSD(false, false),
+  DISK(false, false),
+  ARCHIVE(false, false),
+  PROVIDED(false, false);
 
   private final boolean isTransient;
+  private final boolean isRAM;
 
   public static final StorageType DEFAULT = DISK;
 
   public static final StorageType[] EMPTY_ARRAY = {};
 
   private static final StorageType[] VALUES = values();
 
-  StorageType(boolean isTransient) {
+  StorageType(boolean isTransient, boolean isRAM) {
 this.isTransient = isTransient;
+this.isRAM = isRAM;
   }
 
   public boolean isTransient() {
 return isTransient;
   }
 
+  public boolean isRAM() {
+return isRAM;
+  }

Review comment:
   ok





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491061)
Time Spent: 12h  (was: 11h 50m)

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: YaYun Wang
>Assignee: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, 
> HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, 
> HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch
>
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> The non-volatile memory NVDIMM is faster than SSD and can be used 
> simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on 
> NVDIMM not only improves the response rate of HDFS but also ensures the 
> reliability of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15594) Lazy calculate live datanodes in safe mode tip

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15594?focusedWorklogId=491053=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491053
 ]

ASF GitHub Bot logged work on HDFS-15594:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:38
Start Date: 25/Sep/20 13:38
Worklog Time Spent: 10m 
  Work Description: goiri commented on pull request #2332:
URL: https://github.com/apache/hadoop/pull/2332#issuecomment-698636026


   Thanks @NickyYe for the StringBuilder, that will reduce some memory too.
   I'm a little surprised that Yetus is not kicking in.
   @ayushtkn are you familiar with how to kick it? And now that I have your 
attention... any issues you can see with changing the message?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491053)
Time Spent: 1h  (was: 50m)

> Lazy calculate live datanodes in safe mode tip
> --
>
> Key: HDFS-15594
> URL: https://issues.apache.org/jira/browse/HDFS-15594
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Safe mode tip is printed every 20 seconds.
> This change defers calculating live datanodes until the reported block 
> threshold is met.
>  Old 
> {code:java}
> STATE* Safe mode ON. The reported blocks 111054015 needs additional 27902753 
> blocks to reach the threshold 0.9990 of total blocks 139095856. The number of 
> live datanodes 2531 has reached the minimum number 1. Safe mode will be 
> turned off automatically once the thresholds have been reached.{code}
> New 
> {code:java}
> STATE* Safe mode ON. 
> The reported blocks 134851250 needs additional 3218494 blocks to reach the 
> threshold 0.9990 of total blocks 138207947.
> The number of live datanodes is not calculated since reported blocks hasn't 
> reached the threshold. Safe mode will be turned off automatically once the 
> thresholds have been reached.{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=491042=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491042
 ]

ASF GitHub Bot logged work on HDFS-15025:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:37
Start Date: 25/Sep/20 13:37
Worklog Time Spent: 10m 
  Work Description: brahmareddybattula commented on pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#issuecomment-698212333


   @huangtianhua and @YaYun-Wang  thanks for review. @liuml07  thanks for 
review.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491042)
Time Spent: 11h 50m  (was: 11h 40m)

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: YaYun Wang
>Assignee: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, 
> HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, 
> HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch
>
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> The non-volatile memory NVDIMM is faster than SSD and can be used 
> simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on NVDIMM 
> not only improves the response rate of HDFS but also ensures the reliability 
> of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=490995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490995
 ]

ASF GitHub Bot logged work on HDFS-15025:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:33
Start Date: 25/Sep/20 13:33
Worklog Time Spent: 10m 
  Work Description: brahmareddybattula edited a comment on pull request 
#2189:
URL: https://github.com/apache/hadoop/pull/2189#issuecomment-698212333


   @huangtianhua and @YaYun-Wang  thanks for contribution. @liuml07  thanks for 
review.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490995)
Time Spent: 11h 40m  (was: 11.5h)

> Applying NVDIMM storage media to HDFS
> -
>
> Key: HDFS-15025
> URL: https://issues.apache.org/jira/browse/HDFS-15025
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: YaYun Wang
>Assignee: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, 
> HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, 
> HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch
>
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> The non-volatile memory NVDIMM is faster than SSD and can be used 
> simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on NVDIMM 
> not only improves the response rate of HDFS but also ensures the reliability 
> of the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15590) namenode fails to start when ordered snapshot deletion feature is disabled

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15590?focusedWorklogId=490959=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490959
 ]

ASF GitHub Bot logged work on HDFS-15590:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:29
Start Date: 25/Sep/20 13:29
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #2326:
URL: https://github.com/apache/hadoop/pull/2326#issuecomment-698199742


   Thanks @szetszwo for the review. I have committed this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490959)
Time Spent: 1.5h  (was: 1h 20m)

> namenode fails to start when ordered snapshot deletion feature is disabled
> --
>
> Key: HDFS-15590
> URL: https://issues.apache.org/jira/browse/HDFS-15590
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {code:java}
> 1. Enabled ordered deletion snapshot feature.
> 2. Created snapshottable directory - /user/hrt_6/atrr_dir1
> 3. Created snapshots s0, s1, s2.
> 4. Deleted snapshot s2
> 5. Deleted snapshots s0, s1, s2 again
> 6. Disabled ordered deletion snapshot feature
> 7. Restarted Namenode
> Failed to start namenode.
> org.apache.hadoop.hdfs.protocol.SnapshotException: Cannot delete snapshot s2 
> from path /user/hrt_6/atrr_dir2: the snapshot does not exist.
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:293)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:510)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:819)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:287)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:182)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:912)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:760)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:337)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:755)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:646)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:717)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:960)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:933)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1670)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1737)
> {code}
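
For illustration only, a toy model of the failure mode and one conceivable mitigation; this is NOT the committed fix, and all names below are hypothetical. The idea is that replaying a delete-snapshot op whose snapshot was already removed (e.g. by ordered deletion while the feature was enabled) could be skipped instead of aborting startup:

{code:java}
// Toy sketch, NOT the committed HDFS-15590 fix; all names are hypothetical.
import java.util.HashSet;
import java.util.Set;

public class TolerantSnapshotReplay {
  private final Set<String> snapshots = new HashSet<>();

  public void createSnapshot(String name) {
    snapshots.add(name);
  }

  // During live operation a missing snapshot is a hard error; during edit-log
  // replay it may already have been removed (e.g. by ordered deletion), so
  // the op is skipped rather than failing NameNode startup.
  public void deleteSnapshot(String name, boolean duringReplay) {
    if (!snapshots.remove(name)) {
      if (duringReplay) {
        System.out.println("replay: snapshot " + name + " already gone, skipping");
        return;
      }
      throw new IllegalStateException("Cannot delete snapshot " + name
          + ": the snapshot does not exist.");
    }
  }
}
{code}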



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15596) ViewHDFS#create(f, permission, cflags, bufferSize, replication, blockSize, progress, checksumOpt) should not be restricted to DFS only.

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15596?focusedWorklogId=490870=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490870
 ]

ASF GitHub Bot logged work on HDFS-15596:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:21
Start Date: 25/Sep/20 13:21
Worklog Time Spent: 10m 
  Work Description: umamaheswararao merged pull request #2333:
URL: https://github.com/apache/hadoop/pull/2333


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490870)
Time Spent: 50m  (was: 40m)

> ViewHDFS#create(f, permission, cflags, bufferSize, replication, blockSize, 
> progress, checksumOpt) should not be restricted to DFS only.
> ---
>
> Key: HDFS-15596
> URL: https://issues.apache.org/jira/browse/HDFS-15596
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The ViewHDFS#create(f, permission, cflags, bufferSize, replication, 
> blockSize, progress, checksumOpt) API is already available in FileSystem. It 
> delegates to another overloaded API and can eventually reach ViewFileSystem, 
> so this case also works in regular ViewFileSystem. With ViewHDFS, we 
> restricted this API to DFS only, which causes distcp to fail when the target 
> is non-HDFS, since distcp uses this API.
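
To make the call path concrete, a hedged sketch; the mount path below is hypothetical, and the point is only that this overload is served by the FileSystem base class (which ViewFileSystem can then satisfy) rather than a DFS-specific method:

{code:java}
// Sketch under assumptions: "/mounts/objectstore" is a hypothetical mount
// that resolves to a non-HDFS target filesystem.
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class CreateOverloadExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration()); // may be ViewHDFS
    try (FSDataOutputStream out = fs.create(
        new Path("/mounts/objectstore/out.txt"),
        FsPermission.getFileDefault(),
        EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE),
        4096,                    // bufferSize
        (short) 1,               // replication
        128L * 1024 * 1024,      // blockSize
        null,                    // progress
        null)) {                 // checksumOpt: let the base class default it
      out.writeBytes("hello");
    }
  }
}
{code}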



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15548) Allow configuring DISK/ARCHIVE storage types on same device mount

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15548?focusedWorklogId=490913=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490913
 ]

ASF GitHub Bot logged work on HDFS-15548:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:25
Start Date: 25/Sep/20 13:25
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on a change in pull request #2288:
URL: https://github.com/apache/hadoop/pull/2288#discussion_r494085423



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##
@@ -412,16 +435,28 @@ long getBlockPoolUsed(String bpid) throws IOException {
    */
   @VisibleForTesting
   public long getCapacity() {
+    long capacity;
     if (configuredCapacity < 0L) {
       long remaining;
       if (cachedCapacity > 0L) {
         remaining = cachedCapacity - getReserved();
       } else {
         remaining = usage.getCapacity() - getReserved();
       }
-      return Math.max(remaining, 0L);
+      capacity = Math.max(remaining, 0L);
+    } else {
+      capacity = configuredCapacity;
+    }
+
+    if (enableSameDiskArchival) {

Review comment:
   The return value does not seem to match what the annotation says when this 
feature is enabled:
   > the capacity of the file system excluding space reserved for non-HDFS.
   
   IMO, the ARCHIVE part should also be counted; the NameNode does not seem to 
differentiate between DISK and ARCHIVE per DataNode storage. Please correct me 
if I am wrong.
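
To make the concern concrete, here is a minimal sketch of what a ratio-based split could look like (an assumption about the patch's intent, not its actual code); the open question above is whether the NameNode expects the undivided capacity instead:

{code:java}
// Sketch under assumptions: reservedForArchive is a fraction in [0.0, 1.0]
// and each volume knows whether it is the ARCHIVE side of the shared mount.
public final class SharedMountCapacity {
  private SharedMountCapacity() {}

  static long splitCapacity(long rawCapacity, double reservedForArchive,
      boolean isArchiveVolume) {
    double share = isArchiveVolume ? reservedForArchive
                                   : 1.0 - reservedForArchive;
    return (long) (Math.max(rawCapacity, 0L) * share);
  }
}
{code}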

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##
@@ -452,7 +487,33 @@ public long getAvailable() throws IOException {
   }
 
   long getActualNonDfsUsed() throws IOException {
-    return usage.getUsed() - getDfsUsed();
+    // DISK and ARCHIVAL on same disk

Review comment:
   same confused as the last comment.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##
@@ -190,6 +193,26 @@
     }
     this.conf = conf;
     this.fileIoProvider = fileIoProvider;
+    this.enableSameDiskArchival =
+        conf.getBoolean(DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING,
+            DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING_DEFAULT);
+    if (enableSameDiskArchival) {
+      this.mount = usage.getMount();
+      reservedForArchive = conf.getDouble(

Review comment:
   `reservedForArchive` defines the percentage of capacity reserved for 
archive. If there are heterogeneous disks on one node, do we need to configure 
them separately?

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
##
@@ -62,9 +64,14 @@
   private final VolumeChoosingPolicy<FsVolumeImpl> blockChooser;
   private final BlockScanner blockScanner;
 
+  private boolean enableSameDiskTiering;

Review comment:
   `enableSameDiskTiering` here vs. `enableSameDiskArchival` in FsVolumeImpl: 
we should unify the variable name.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490913)
Time Spent: 5h 10m  (was: 5h)

> Allow configuring DISK/ARCHIVE storage types on same device mount
> -
>
> Key: HDFS-15548
> URL: https://issues.apache.org/jira/browse/HDFS-15548
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> We can allow configuring DISK/ARCHIVE storage types on the same device mount 
> using two separate directories.
> Users should be able to configure the capacity for each. Also, the datanode 
> usage report should report stats correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15596) ViewHDFS#create(f, permission, cflags, bufferSize, replication, blockSize, progress, checksumOpt) should not be restricted to DFS only.

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15596?focusedWorklogId=490904=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490904
 ]

ASF GitHub Bot logged work on HDFS-15596:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:24
Start Date: 25/Sep/20 13:24
Worklog Time Spent: 10m 
  Work Description: umamaheswararao commented on pull request #2333:
URL: https://github.com/apache/hadoop/pull/2333#issuecomment-698367707


   Jenkins Run:
   
https://issues.apache.org/jira/browse/HDFS-15596?focusedCommentId=17201381=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17201381
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490904)
Time Spent: 1h  (was: 50m)

> ViewHDFS#create(f, permission, cflags, bufferSize, replication, blockSize, 
> progress, checksumOpt) should not be restricted to DFS only.
> ---
>
> Key: HDFS-15596
> URL: https://issues.apache.org/jira/browse/HDFS-15596
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The ViewHDFS#create(f, permission, cflags, bufferSize, replication, 
> blockSize, progress, checksumOpt) API is already available in FileSystem. It 
> delegates to another overloaded API and can eventually reach ViewFileSystem, 
> so this case also works in regular ViewFileSystem. With ViewHDFS, we 
> restricted this API to DFS only, which causes distcp to fail when the target 
> is non-HDFS, since distcp uses this API.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15596) ViewHDFS#create(f, permission, cflags, bufferSize, replication, blockSize, progress, checksumOpt) should not be restricted to DFS only.

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15596?focusedWorklogId=490799=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490799
 ]

ASF GitHub Bot logged work on HDFS-15596:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:13
Start Date: 25/Sep/20 13:13
Worklog Time Spent: 10m 
  Work Description: umamaheswararao opened a new pull request #2333:
URL: https://github.com/apache/hadoop/pull/2333


   https://issues.apache.org/jira/browse/HDFS-15596



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490799)
Time Spent: 40m  (was: 0.5h)

> ViewHDFS#create(f, permission, cflags, bufferSize, replication, blockSize, 
> progress, checksumOpt) should not be restricted to DFS only.
> ---
>
> Key: HDFS-15596
> URL: https://issues.apache.org/jira/browse/HDFS-15596
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The ViewHDFS#create(f, permission, cflags, bufferSize, replication, 
> blockSize, progress, checksumOpt) API is already available in FileSystem. It 
> delegates to another overloaded API and can eventually reach ViewFileSystem, 
> so this case also works in regular ViewFileSystem. With ViewHDFS, we 
> restricted this API to DFS only, which causes distcp to fail when the target 
> is non-HDFS, since distcp uses this API.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15594) Lazy calculate live datanodes in safe mode tip

2020-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15594?focusedWorklogId=490836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490836
 ]

ASF GitHub Bot logged work on HDFS-15594:
-

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:17
Start Date: 25/Sep/20 13:17
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2332:
URL: https://github.com/apache/hadoop/pull/2332#discussion_r494462104



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerSafeMode.java
##
@@ -309,16 +311,21 @@ String getSafeModeTip() {
     }
 
     if (datanodeThreshold > 0) {
-      int numLive = blockManager.getDatanodeManager().getNumLiveDataNodes();
-      if (numLive < datanodeThreshold) {
-        msg += String.format(
-            "The number of live datanodes %d needs an additional %d live "
-            + "datanodes to reach the minimum number %d.%n",
-            numLive, (datanodeThreshold - numLive), datanodeThreshold);
+      if (isBlockThresholdMet) {
+        int numLive = blockManager.getDatanodeManager().getNumLiveDataNodes();
+        if (numLive < datanodeThreshold) {
+          msg += String.format(
+              "The number of live datanodes %d needs an additional %d live "
+              + "datanodes to reach the minimum number %d.%n",
+              numLive, (datanodeThreshold - numLive), datanodeThreshold);
+        } else {
+          msg += String.format("The number of live datanodes %d has reached "
+              + "the minimum number %d. ",
+              numLive, datanodeThreshold);
+        }
       } else {
-        msg += String.format("The number of live datanodes %d has reached "
-            + "the minimum number %d. ",
-            numLive, datanodeThreshold);
+        msg += "The number of live datanodes is not calculated " +

Review comment:
   While we are at it, does it make sense to use a StringBuilder?
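
A hedged sketch of what the StringBuilder variant could look like, mirroring the snippet above (illustrative only, not the final patch):

{code:java}
// Illustrative only: same logic as the patch above, accumulated into a
// StringBuilder instead of repeated String concatenation.
StringBuilder msg = new StringBuilder();
if (datanodeThreshold > 0) {
  if (isBlockThresholdMet) {
    int numLive = blockManager.getDatanodeManager().getNumLiveDataNodes();
    if (numLive < datanodeThreshold) {
      msg.append(String.format(
          "The number of live datanodes %d needs an additional %d live "
          + "datanodes to reach the minimum number %d.%n",
          numLive, datanodeThreshold - numLive, datanodeThreshold));
    } else {
      msg.append(String.format("The number of live datanodes %d has reached "
          + "the minimum number %d. ", numLive, datanodeThreshold));
    }
  } else {
    msg.append("The number of live datanodes is not calculated since "
        + "reported blocks hasn't reached the threshold. ");
  }
}
return msg.toString();
{code}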





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490836)
Time Spent: 50m  (was: 40m)

> Lazy calculate live datanodes in safe mode tip
> --
>
> Key: HDFS-15594
> URL: https://issues.apache.org/jira/browse/HDFS-15594
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Safe mode tip is printed every 20 seconds.
> This change skips calculating live datanodes until the reported block 
> threshold is met.
>  Old 
> {code:java}
> STATE* Safe mode ON. The reported blocks 111054015 needs additional 27902753 
> blocks to reach the threshold 0.9990 of total blocks 139095856. The number of 
> live datanodes 2531 has reached the minimum number 1. Safe mode will be 
> turned off automatically once the thresholds have been reached.{code}
> New 
> {code:java}
> STATE* Safe mode ON. 
> The reported blocks 134851250 needs additional 3218494 blocks to reach the 
> threshold 0.9990 of total blocks 138207947.
> The number of live datanodes is not calculated since reported blocks hasn't 
> reached the threshold. Safe mode will be turned off automatically once the 
> thresholds have been reached.{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org