[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-10-01 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942401#comment-16942401
 ] 

Wei-Chiu Chuang commented on HDFS-14313:


Conflicts are trivial so I pushed them into branch-3.2 and branch-3.1. Patch 
attached to the jira for posterity.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14313-branch-2.v1.patch, 
> HDFS-14313-branch-2.v2.patch, HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, 
> HDFS-14313.branch-3.0.v2.patch, HDFS-14313.branch-3.1.patch, 
> HDFS-14313.branch-3.2.patch, HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-10-01 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942389#comment-16942389
 ] 

Wei-Chiu Chuang commented on HDFS-14313:


This is missing from branch-3.2 and branch-3.1 unfortunately ...

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0
>
> Attachments: HDFS-14313-branch-2.v1.patch, 
> HDFS-14313-branch-2.v2.patch, HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, 
> HDFS-14313.branch-3.0.v2.patch, HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902154#comment-16902154
 ] 

Lisheng Sun commented on HDFS-14313:


the branch-2.v2 for patch fixes unchecked issue in javac. It can be ignored.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313-branch-2.v1.patch, 
> HDFS-14313-branch-2.v2.patch, HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, 
> HDFS-14313.branch-3.0.v2.patch, HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902141#comment-16902141
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HDFS-14313 does not apply to branch-2. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976946/HDFS-14313-branch-2.v2.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27440/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313-branch-2.v1.patch, 
> HDFS-14313-branch-2.v2.patch, HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, 
> HDFS-14313.branch-3.0.v2.patch, HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902024#comment-16902024
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
5s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
10s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
47s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
22s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
45s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
13s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
41s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
29s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
52s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 11m 52s{color} 
| {color:red} root-jdk1.7.0_95 with JDK v1.7.0_95 generated 3 new + 1441 
unchanged - 3 fixed = 1444 total (was 1444) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
31s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 10m 31s{color} 
| {color:red} root-jdk1.8.0_222 with JDK v1.8.0_222 generated 4 new + 1342 
unchanged - 4 fixed = 1346 total (was 1346) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 44s{color} | {color:orange} root: The patch generated 3 new + 324 unchanged 
- 1 fixed = 327 total (was 325) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
49s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 40s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902009#comment-16902009
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m  
7s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
37s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
21s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
56s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
17s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
40s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
54s{color} | {color:green} branch-3.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
19s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 59s{color} | {color:orange} root: The patch generated 3 new + 296 unchanged 
- 1 fixed = 299 total (was 297) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m  8s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m  9s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.util.TestReadWriteDiskValidator |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:e402791 |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976909/HDFS-14313.branch-3.0.v2.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux f9b46a824269 4.4.0-157-generic #185-Ubuntu SMP Tue Jul 23 
09:17:01 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901928#comment-16901928
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
14s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
45s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
52s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  9m 
24s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
21m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
16s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
19s{color} | {color:green} branch-3.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m  
4s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 59s{color} | {color:orange} root: The patch generated 3 new + 296 unchanged 
- 1 fixed = 299 total (was 297) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 10m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 58s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 27s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}155m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ha.TestZKFailoverController |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:e402791 |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976897/HDFS-14313.branch-3.0.v1.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901783#comment-16901783
 ] 

Lisheng Sun commented on HDFS-14313:


Thank [~linyiqun] for your suggestion. I rename patch for branch-3.x and upload 
the branch-3.0.v1 patch.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, 
> HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901781#comment-16901781
 ] 

Yiqun Lin commented on HDFS-14313:
--

I don't mean you need to patch for all branches, actually two patches are 
enough, branch-3 and branch-2.

Since branch-3 seems not be updated, so I suggest you to patch for branch-3.0 
instead of. So if you attach the patch for branch-2, please use the name 
HDFS-14313-branch-2.v1.patch.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901778#comment-16901778
 ] 

Lisheng Sun commented on HDFS-14313:


I attached patch for branch-3.  you hope I attach patch for branch-3.0, 
branch-3.1, branch3.2 and branch2.9. Instead of branch-3 and branch-2? I think 
I was misunderstanding.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901765#comment-16901765
 ] 

Yiqun Lin commented on HDFS-14313:
--

[~leosun08], you need to rename your branch patch to a correct name. Here we 
need to contain the branch name, e.g. {{HDFS-14313-BRANCH_NAME.patch}}. So if 
we want to patch for branch-3.0, the patch name should be 
HDFS-14313-branch-3.0.v1.patch rather than HDFS-14313.branch-3.v1.patch. The 
BRANCH_NAME will be detected and trigger the Jenkins for corresponding branch.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901758#comment-16901758
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 13m 
56s{color} | {color:red} Docker failed to build yetus/hadoop:20ca6773f1e. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976891/HDFS-14313.branch-3.v1.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27429/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch, HDFS-14313.branch-3.v1.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901736#comment-16901736
 ] 

Lisheng Sun commented on HDFS-14313:


Thanx [~linyiqun] for all your work for this patch. I will attach the patch for 
branch-3.x and branch-2.x branches later. 

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901635#comment-16901635
 ] 

Hudson commented on HDFS-14313:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17055 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17055/])
HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo 
(yqlin: rev a5bb1e8ee871dfff77d0f6921b13c8ffb50e)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSCachingGetSpaceUsed.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaCachingGetSpaceUsed.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GetSpaceUsed.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestReplicaCachingGetSpaceUsed.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* (edit) hadoop-common-project/hadoop-common/src/main/resources/core-default.xml


> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901632#comment-16901632
 ] 

Yiqun Lin commented on HDFS-14313:
--

I have committed this to trunk, but find the conflicts when backport to 
branch-3.x and branch-2.x branches. [~leosun08], would you mind attach the 
patch for those branches? I think this improvement is very helpful and can be 
backported in these versions.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901628#comment-16901628
 ] 

Yiqun Lin commented on HDFS-14313:
--

LGTM, +1. Committing this.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901393#comment-16901393
 ] 

Wei-Chiu Chuang commented on HDFS-14313:


+1

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901388#comment-16901388
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  1s{color} | {color:orange} root: The patch generated 3 new + 284 unchanged 
- 1 fixed = 287 total (was 285) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
23s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
54s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestLargeBlockReport |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976822/HDFS-14313.014.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 21b0e4638ad6 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901146#comment-16901146
 ] 

Lisheng Sun commented on HDFS-14313:


Fix UT in hadoop.conf.TestCommonConfigurationFields and other UTs fails is 
unrelated to this patch.  

Upload the v014 patch. Please help review it. Thank [~linyiqun].

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, 
> HDFS-14313.014.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901103#comment-16901103
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
0s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
0s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 16s{color} | {color:orange} root: The patch generated 1 new + 171 unchanged 
- 1 fixed = 172 total (was 172) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 22s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 59s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 48s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
53s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}228m 25s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.conf.TestCommonConfigurationFields |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.datanode.TestLargeBlockReport |
|   | hadoop.hdfs.server.balancer.TestBalancerRPCDelay |
|   | hadoop.hdfs.TestDecommission |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976803/HDFS-14313.013.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 653814de05ce 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900863#comment-16900863
 ] 

Lisheng Sun commented on HDFS-14313:


Thank [~linyiqun] for your reminder. I rebased the code, updated description 
about config key "fs.getspaceused.classname" in core-default.xml and uploaded 
the v013 patch. 

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900849#comment-16900849
 ] 

Yiqun Lin commented on HDFS-14313:
--

[~leosun08], can you rebase the code? I applied the v012 patch in my local, it 
showed that the core-default.xml cannot be applied. You may git pull the latest 
code and then generate the new patch file.
{noformat}
error: patch failed: 
hadoop-common-project/hadoop-common/src/main/resources/core-default.xml:3494
error: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml: 
patch does not apply
{noformat}
 Also can you update the description to following? From the original design, 
GetSpaceUsed is not only used for telling HDFS space used.
{noformat}
   
  The class that can tell estimate much space is used in a directory.
  There are four impl classes that being supported: 
  org.apache.hadoop.fs.DU(default), org.apache.hadoop.fs.WindowsGetSpaceUsed
  org.apache.hadoop.fs.DFCachingGetSpaceUsed and
  
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.
  And the ReplicaCachingGetSpaceUsed impl class only used in HDFS module.

{noformat}

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900753#comment-16900753
 ] 

Lisheng Sun commented on HDFS-14313:


{quote}
HDFS-14313 does not apply to trunk. Rebase required? Wrong Branch? See 
https://wiki.apache.org/hadoop/HowToContribute for help.
{quote}
That I only update core-default.xml leads to this problem.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900748#comment-16900748
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HDFS-14313 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976792/HDFS-14313.012.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27413/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900724#comment-16900724
 ] 

Lisheng Sun commented on HDFS-14313:


Thanx [~linyiqun] [~jojochuang] for your good suggestions.  I added the key 
fs.getspaceused.classname and the usage of these four impl classes in 
core-default.xml. 

I have uloaded the v12 patch. Thank you again.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch, HDFS-14313.012.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-06 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900699#comment-16900699
 ] 

Yiqun Lin commented on HDFS-14313:
--

[~leosun08],  at least, I think add the key fs.getspaceused.classname in 
core-default.xml should be okay as I see this key is missing for a long time.
{quote}That ReplicaCachingGetSpaceUsed of hdfs module is added in 
core-default.xml of common module should not be very good
{quote}
As [~jojochuang] mentioned, we can document the usage of these four impl 
classes in config's description to let users know. It will be okay I think.

Except this, others is ready to go, :). This will be a very good improvement.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-05 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900563#comment-16900563
 ] 

Lisheng Sun commented on HDFS-14313:


Thank [~jojochuang] for your suggestion.
  
{quote}One thing that's missing out in the v11 is the update to 
hdfs-default.xml which was missing since v8.
{quote}

 According to [~linyiqun] suggetion, I define a hard-coded threadold time value 
like 1000ms in ReplicaCachingGetSpaceUsed. So remove config in hdfs-default.xml.
{code:java}
private static final long DEEP_COPY_REPLICA_THRESHOLD_MS = 50;
private static final long REPLICA_CACHING_GET_SPACE_USED_THRESHOLD_MS = 1000;
{code}

{quote}
Additionally there should be an additional config key 
"fs.getspaceused.classname" in core-default.xml, and state that possible 
options are

org.apache.hadoop.fs.DU (default)
org.apache.hadoop.fs.WindowsGetSpaceUsed
org.apache.hadoop.fs.DFCachingGetSpaceUsed
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed
{quote}

That ReplicaCachingGetSpaceUsed of hdfs module is added in core-default.xml of 
common module should not be very good. Please correct me if I was wrong. Thank 
you again.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-05 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900561#comment-16900561
 ] 

Yiqun Lin commented on HDFS-14313:
--

{quote}
One thing that's missing out in the v11 is the update to hdfs-default.xml which 
was missing since v8.
{quote}
Not fully get this point. The print time threshold key is no longer needed now. 
Does this mean we need to add fs.getspaceused.classname key in hdfs-default as 
well?

Hi [~leosun08], can you additionally address the [~jojochuang]'s comment?

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-05 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900549#comment-16900549
 ] 

Wei-Chiu Chuang commented on HDFS-14313:


Hi [~leosun08] thanks a lot for continued work here, and [~linyiqun] for 
continued review while I dropped out.

I think v11 patch is good. [~linyiqun] if you are ok feel free to commit.

 

One thing that's missing out in the v11 is the update to hdfs-default.xml which 
was missing since v8.

Additionally there should be an additional config key 
"fs.getspaceused.classname" in core-default.xml, and state that possible 
options are
 # org.apache.hadoop.fs.DU (default)
 # org.apache.hadoop.fs.WindowsGetSpaceUsed
 # org.apache.hadoop.fs.DFCachingGetSpaceUsed
 # 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed

 

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-05 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900515#comment-16900515
 ] 

Lisheng Sun commented on HDFS-14313:


Thanx  [~linyiqun] for your deep review. I updated this patch as your 
comments. 

checkstyle issue:
{code:java}
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSCachingGetSpaceUsed.java:64:
public Builder setBpid(String bpid) {:35: 'bpid' hides a field. 
[HiddenField]
{code}
I think it is not a problem. I haved uploaded the v11 patch. Could you help 
review it?  Thank you. 

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, 
> HDFS-14313.011.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-05 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900239#comment-16900239
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
21m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m 
58s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 59s{color} | {color:orange} root: The patch generated 1 new + 171 unchanged 
- 1 fixed = 172 total (was 172) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m  
8s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}131m 40s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 4s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}274m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestFileChecksumCompositeCrc |
|   | hadoop.hdfs.server.namenode.TestFSImageWithXAttr |
|   | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.server.datanode.TestLargeBlockReport |
|   | hadoop.hdfs.TestErasureCodingMultipleRacks |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.TestFileAppend3 |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.TestErasureCodingPolicies |
|   | hadoop.hdfs.TestDFSFinalize |
|   | hadoop.hdfs.server.mover.TestStorageMover |
|   | hadoop.hdfs.TestDFSPermission |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-05 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899932#comment-16899932
 ] 

Yiqun Lin commented on HDFS-14313:
--

Two review comments:

I'd like to decorate the javadoc comment to a more readable way.
 From
{noformat}
Fast and accurate class to tell how much space HDFS is using. This class get
hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory for HDFS.

Get hdfs used space by ReplicaCachingGetSpaceUsed impl only includes block
and meta, but get space used by DU impl includes include all directories
size, other files such as VERSION, in_use.lock and so on. Get space used by
DU impl must be greater than by ReplicaCachingGetSpaceUsed impl. Get space
used by ReplicaCachingGetSpaceUsed impl is more accurate.

setting fs.getspaceused.classname to
org.apache.hadoop.hdfs.server.datanode.fsdataset
impl.ReplicaCachingGetSpaceUsed in your core-site.xml if we want to enable
this class.
{noformat}
To 
{noformat}
Fast and accurate class to tell how much space HDFS is using. This class gets
hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos that uses an in
memory way.

Getting hdfs used space by ReplicaCachingGetSpaceUsed impl only includes block
and meta files, but DU impl is blockpool dir based statistics that will
include additional files, e.g. tmp dir, scanner.cursor file. Getting space used 
by
DU impl will be greater than by ReplicaCachingGetSpaceUsed impl, but the latter 
is more accurate.

Setting fs.getspaceused.classname to
org.apache.hadoop.hdfs.server.datanode.fsdataset
impl.ReplicaCachingGetSpaceUsed in your core-site.xml if we want to enable
this class.
{noformat}

{noformat}
os.close();
fs.delete(new Path("/testReplicaCachingGetSpaceUsedByRBWReplica"), true);
{noformat}
Can we add the asset operation again after close operation?  After close 
operation, the replica state will be transformed from RBW to finalized. But the 
space used of these replicas  are all included and the dfsUsed value should be 
same.
{noformat}
os.close();
assertEquals(blockLength + metaLength, dataNode.getFSDataset().getDfsUsed());
fs.delete(new Path("/testReplicaCachingGetSpaceUsedByRBWReplica"), true);
{noformat}

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-03 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899565#comment-16899565
 ] 

Lisheng Sun commented on HDFS-14313:


ping [~linyiqun] Could you mind taking a review for this patch? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898122#comment-16898122
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 47s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  4s{color} | {color:orange} root: The patch generated 3 new + 171 unchanged 
- 1 fixed = 174 total (was 172) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
45s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
55s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}189m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.server.datanode.TestLargeBlockReport |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976426/HDFS-14313.010.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3084f477ad20 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a7371a7 |
| maven | version: Apache Maven 3.3.9 |
| Default 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-01 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897988#comment-16897988
 ] 

Lisheng Sun commented on HDFS-14313:


Thanx for [~linyiqun] for deep review.  I have updated this patch as your 
comments and uploaded the v10 patch.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-08-01 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897804#comment-16897804
 ] 

Yiqun Lin commented on HDFS-14313:
--

Almost loogs good now, some minor comments from me:
{noformat}
The deepCopyReplica call does't use the datasetock s
{noformat}
One typo , does't --> doesn't
{noformat}
setting set fs.getspaceused.classname
{noformat}
please remove the redundant set and update setting to Setting.
{noformat}
"blockPoolId: {}, replicas size: {}, copy replicas duration: {}ms"
{noformat}
Can you update to
{noformat}
Copy replica infos, blockPoolId: {}, replicas size: {}, duration: {}ms"
{noformat}
Update refresh to Refresh.
{noformat}
fs.getClient().delete("/testReplicaCachingGetSpaceUsed", true);
{noformat}
We can directly call the filesystem api to delete file.
{noformat}
fs.delete(new Path("/testReplicaCachingGetSpaceUsed"), true);
{noformat}
{quote}Get space used by DU impl must be greater than by 
ReplicaCachingGetSpaceUsed impl. get space used by ReplicaCachingGetSpaceUsed 
impl is more accurate. so is it necessary to add comparison for the DU impl 
class?
{quote}
You have raised up a good point the ReplicaCachingGetSpaceUsed way will only 
calculate the finalized blocks while du command way includes more files. Can 
you comment this important difference in the javadoc comment of class 
ReplicaCachingGetSpaceUsed? We should let others know which files this class 
will calculate for.

Yes, the calculation way is different now. Can you add an additional test to 
test with the case that some blocks files are not finalized, for example being 
rbw state? And then we check if the dfsused is correctly be updated.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-30 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896696#comment-16896696
 ] 

Lisheng Sun commented on HDFS-14313:


hi [~linyiqun]  Could you have time to continue review ? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-29 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894989#comment-16894989
 ] 

Hadoop QA commented on HDFS-14313:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
16s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 16s{color} | {color:orange} root: The patch generated 3 new + 171 unchanged 
- 1 fixed = 174 total (was 172) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 22s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
17s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 78m 
51s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
57s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976087/HDFS-14313.009.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c7be60022bf8 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 02bd02b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27325/artifact/out/diff-checkstyle-root.txt
 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-28 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894898#comment-16894898
 ] 

Lisheng Sun commented on HDFS-14313:


Updated patch according the review comments and uploaded the v9 patch.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch, HDFS-14313.009.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-28 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894893#comment-16894893
 ] 

Lisheng Sun commented on HDFS-14313:


Thanx [~linyiqun] for your review.

*FSCachingGetSpaceUsed*
{quote}Line 53: Add the final keyword for the FsVolumeImpl variable.
 Line 54: Add the final keyword too.
{quote}
The value of volume and bpid is assignment by set*, so don't add the final 
keyword for these two variable.
{quote}Line 75: We don't pass the config to use the threshold time now, still 
we need to override this method? If don't need, the change made in
 GetSpaceUsed can also be reverted.
{quote}
Add the new variable volume and bpid of FSCachingGetSpaceUsed for HDFS module,so
 ReplicaCachingGetSpaceUsed‘s Constructor parameter must be 
FSCachingGetSpaceUsed#Builder and don't remove FSCachingGetSpaceUsed#build().
 * TestReplicaCachingGetSpaceUsed*
{quote}Line 69: As I have mentioned before, can we have an additional 
comparison for the DU impl class? The most of lines can be reused for these two 
getused impl class. Just passing different key value with restart the mini 
cluster and comparing the used space.
{quote}

 get space used by DU impl include

├── current
│   ├── BP-1876464514-10.239.56.179-1564369203299
│   │   ├── current
│   │   │   ├── VERSION
│   │   │   ├── finalized
│   │   │   │   └── subdir0
│   │   │   │   └── subdir0
│   │   │   │   ├── blk_1073741825
│   │   │   │   └── blk_1073741825_1001.meta
│   │   │   └── rbw
│   │   ├── scanner.cursor
│   │   └── tmp
│   └── VERSION
└── in_use.lock

get space used by ReplicaCachingGetSpaceUsed impl include

├── blk_1073741825
└── blk_1073741825_1001.meta

 get space used by DU impl include all directories size, other files such as 
VERSION, in_use.lock and so on

Get space used by DU impl must be greater than by ReplicaCachingGetSpaceUsed 
impl. get space used by ReplicaCachingGetSpaceUsed impl is more accurate. so is 
it necessary to add comparison for the DU impl class? 

Please correct me if I was wrong. Thank [~linyiqun] again.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-27 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894619#comment-16894619
 ] 

Yiqun Lin commented on HDFS-14313:
--

Thanks for updating the patch, [~leosun08]. Some review comments for the patch:

*FSCachingGetSpaceUsed*
 Line 35: Can we update the annotation to {{@InterfaceAudience.Private}} since 
this is only specified for HDFS module.?
 Line 38: Remove the public keyword for LOG instance.
 Line 53: Add the final keyword for the FsVolumeImpl variable.
 Line 54: Add the final keyword too.
 Line 75: We don't pass the config to use the threshold time now, still we need 
to override this method? If don't need, the change made in
 GetSpaceUsed can also be reverted.

*FsDatasetImpl*
{noformat}
FsVolumeList#addBlockPool -> FsVolumeImpl#addBlockPool -> new BlockPoolSlice -> 
FsDatasetImpl#deepCopyReplica. If deepCopyReplica use datasetock, it appears 
deadlock.
{noformat}
This comment can simplified to 'The deepCopyReplica call does't use the 
datasetock since it will lead the potential deadlock with the

{@link FsVolumeList#addBlockPool} call.'

*ReplicaCachingGetSpaceUsed*
 Line 41:
{noformat}
To use set fs.getspaceused.classname to 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed
 in your core-site.xml.
{noformat}
can update to a more readable way:
{noformat}
Setting fs.getspaceused.classname to 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed
 in core-site.xml if we want to enable this class.
{noformat}
Line 46: Update the annotation to {{@InterfaceAudience.Private}}.
 Line 49: Remove the public keyword.
 Line 51: Add the final keyword.
 Line 52: Add the final keyword.
 Line76: I would prefer to define a static final variable to hard-coded the 
value in ReplicaCachingGetSpaceUsed.
 Line 77: Can we use the parameter way to print the log? Like 
LOG.debug("BlockPoolId: {}, replicas size: {}, copy replicas duration: {}ms.", 
obj1, obj2...}
 Line 95: The same comment for Line 76.
 The 96:The same comment for Line 77.
 Line 101: Update replicaCachingGetSpaceUsed to ReplicaCachingGetSpaceUsed.

*TestReplicaCachingGetSpaceUsed*
 Line 42: Please add the definition comment for this class.
 Line 51: We can use Configuration#setClass call to set the class impl
{noformat}
conf.setClass("fs.getspaceused.classname", ReplicaCachingGetSpaceUsed.class, 
CachingGetSpaceUsed.class);
{noformat}
Line 61: It will be a good behavior to clean the hdfs path we created for the 
test.
 Line 69: As I have mentioned before, can we have an additional comparison for 
the DU impl class? The most of lines can be reused for these two getused impl 
class. Just passing different key value with restart the mini cluster and 
comparing the used space.

 

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-27 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894465#comment-16894465
 ] 

Lisheng Sun commented on HDFS-14313:


hi [~linyiqun] [~jojochuang] Could you have time to review this patch? Thank 
you a lot.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893159#comment-16893159
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  8m 
50s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 41s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 16s{color} | {color:orange} root: The patch generated 4 new + 172 unchanged 
- 1 fixed = 176 total (was 173) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
18s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 34s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
49s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}201m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.0 Server=19.03.0 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12975834/HDFS-14313.008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 6798be829d86 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-25 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892921#comment-16892921
 ] 

Lisheng Sun commented on HDFS-14313:


Thank [~linyiqun] for your review. I have update the patch as your comments. 
Could you help review it? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, 
> HDFS-14313.008.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-23 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891132#comment-16891132
 ] 

Yiqun Lin commented on HDFS-14313:
--

Hi [~leosun08],
1.
{quote}
I don't use the component specific impl class in the common module. to update 
in GetSpaceUsed of common model is just to make subclasses inheritable. And 
that to update in CommonConfigurationKeys of common model is to print threshold 
time,which should be moved to DFSConfigKeys and is more appropriate..
{quote}
I don't think it's necessary to have two new config to make the threshold time 
configurable, the hard-coded way should be enough. I mean we can define a 
hard-coded threadold time value like 1000ms in ReplicaCachingGetSpaceUsed. So 
that we don't need to do any change in Common.

2. Can you add comment for the potential dead lock issue in the method 
deepCopyReplica? That will be let others know the context. 

Besides above two points, others make sense to me.


> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-22 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890237#comment-16890237
 ] 

Lisheng Sun commented on HDFS-14313:


hi [~linyiqun], could you mind continuing to reiview this patch? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-21 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16889667#comment-16889667
 ] 

Lisheng Sun commented on HDFS-14313:


Thank [~linyiqun] for your review carefully. I have a few questions to discuss 
with you.

1 I don't use the component specific impl class in the common module. to update 
in GetSpaceUsed of common model is just to make subclasses inheritable. And 
that to update in CommonConfigurationKeys of common model is to print threshold 
time,which should be moved to DFSConfigKeys and is more appropriate.

2. Now there are switches that are used in control. follow as 
GetSpaceUsed#Builder#CLASSNAME_KEY
{code:java}
// static final String CLASSNAME_KEY = "fs.getspaceused.classname";
{code}
if add enableFSCachingGetSpace as you say, there are two switches that give the 
user more confusion.

3.
{quote}Even though deepCopyReplica is only used by another thread, I still 
prefer to let it be an atomic operation incase this will be used in other 
places in the future. Can you add datasetock here?
{quote}
FsDatasetImpl#addBlockPool with datasetLock ->FsVolumeList#addBlockPool
{code:java}
@Override
public void addBlockPool(String bpid, Configuration conf)
throws IOException {
  LOG.info("Adding block pool " + bpid);
  try (AutoCloseableLock lock = datasetLock.acquire()) {
volumes.addBlockPool(bpid, conf);
volumeMap.initBlockPool(bpid);
  }
  volumes.getAllVolumesMap(bpid, volumeMap, ramDiskReplicaTracker);
}
{code}
FsVolumeList#addBlockPool ->FsVolumeImpl#addBlockPool -> new BlockPoolSlice 
->FsDatasetImpl#deepCopyReplica. If deepCopyReplica use datasetock, it appears 
deadlock. So use Collections.unmodifiableSet to make replica info is not 
allowed to be modified outside
{code:java}
void addBlockPool(final String bpid, final Configuration conf) throws 
IOException {
  long totalStartTime = Time.monotonicNow();
  final Map unhealthyDataDirs =
  new ConcurrentHashMap();
  List blockPoolAddingThreads = new ArrayList();
  for (final FsVolumeImpl v : volumes) {
Thread t = new Thread() {
  public void run() {
try (FsVolumeReference ref = v.obtainReference()) {
  FsDatasetImpl.LOG.info("Scanning block pool " + bpid +
  " on volume " + v + "...");
  long startTime = Time.monotonicNow();
  v.addBlockPool(bpid, conf);
  long timeTaken = Time.monotonicNow() - startTime;
  FsDatasetImpl.LOG.info("Time taken to scan block pool " + bpid +
  " on " + v + ": " + timeTaken + "ms");
} catch (ClosedChannelException e) {
  // ignore.
} catch (IOException ioe) {
  FsDatasetImpl.LOG.info("Caught exception while scanning " + v +
  ". Will throw later.", ioe);
  unhealthyDataDirs.put(v, ioe);
}
  }
};
blockPoolAddingThreads.add(t);
t.start();
  }
{code}
 4. According to your suggestion, I will modify UT for using  real minicluster 
and adding a comparison test by respectively using FSCachingGetUsed and default 
Du way.

Please correct me if I was wrong. Thank [~linyiqun] again.

 

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-19 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888592#comment-16888592
 ] 

Yiqun Lin commented on HDFS-14313:
--

Take a deep review for the patch, this FSCachingGetSpaceUsed should be the HDFS 
specific impl class. I prefer not to make any change in the Common module. We 
can do this extend like HDFS DFSNetworkTopology did, add new config to enable 
or disable this. The instance will be got like this:
{code:java}
 if(enableFSCachingGetSpace) {
 this.dfsUsage = new CachingGetSpaceUsed.Builder().setPath(bpDir)
.setConf(conf)
 
.setInitialUsed(loadDfsUsed())
 .build();
 } else {
this.dfsUsage = new FSCachingGetSpaceUsed.Builder().setBpid(bpid)
.setVolume(volume)
.setPath(bpDir)
.setConf(conf)
.setInitialUsed(loadDfsUsed())
.build();
}
{code}
{code:java}
+  @Override
+  public Set deepCopyReplica(String bpid)
+  throws IOException {
+Set replicas =
+new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
+: volumeMap.replicas(bpid));
+return Collections.unmodifiableSet(replicas);
+  }
{code}
Even though deepCopyReplica is only used by another thread, I still prefer to 
let it be an atomic operation incase this will be used in other places in the 
future. Can you add datasetock here?
{noformat}
fs.getspaceused.deep.copy.replica.threshold.ms
fs.getspaceused.replica.caching.threshold.ms
{noformat}
I don't think above two configs are really needed once we make this as the HDFS 
specific getused class. Instead of, we can still keep this using a hard-coded 
way.

For the unit test {{TestReplicaCachingGetSpaceUsed}}, why not use a more real 
way to test this? I prefer to start up the real minicluster then write the 
actual blocks files and do the verification. Mocking way needs many steps to 
setup the DN and also not convenient to cover all the cases.

Also we need a comparison test by respectively using FSCachingGetUsed and 
default Du way, to check if the dfsUsed calculated is same. 

[~leosun08], I suppose you may need to do a refactor for the latest patch.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888028#comment-16888028
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
3s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
8s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
32s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 11s{color} | {color:orange} root: The patch generated 3 new + 245 unchanged 
- 1 fixed = 248 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 57s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
23s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 39s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}217m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.8 Server=18.09.8 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12975148/HDFS-14313.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888023#comment-16888023
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
59s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 24s{color} | {color:orange} root: The patch generated 3 new + 245 unchanged 
- 1 fixed = 248 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m  
6s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 55s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
54s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}219m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12975147/HDFS-14313.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-18 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887859#comment-16887859
 ] 

Lisheng Sun commented on HDFS-14313:


Thank [~linyiqun] for your suggestions. 

 {quote}
why not use dataset lock?
{quote}
I don't use FsDatasetImpl#datasetLock , Because FsDatasetImpl#addBlockPool with 
datasetLock call FsDatasetImpl#deepCopyReplica in another Thread.  According to 
your suggestion, I use Collections.unmodifiableSet to make replica info is not 
allowed to be modified outside.
And I have updated this patch. Could you help reiview it? Thank you again.


> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-17 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887592#comment-16887592
 ] 

Yiqun Lin commented on HDFS-14313:
--

I have two comments for the core implementation of the 
ReplicaCachingGetSpaceUsed.

{code}
 @Override
  public synchronized Set deepCopyReplica(String bpid)
  throws IOException {
Set replicas =
new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
: volumeMap.replicas(bpid));
return replicas;
  }
{code}
why not use dataset lock? Here the volumeMap is thread-safe controlled by the 
dataset lock in FsDatasetImpl operations. Synchronized cannot ensure the race 
condition. Also it would be better to use {{ Collections.unmodifiableSet}} to 
make replica info is not allowed to be modified outside.

{code}
if (CollectionUtils.isNotEmpty(replicaInfos)) {
+for (ReplicaInfo replicaInfo : replicaInfos) {
+  if (Objects.equals(replicaInfo.getVolume().getStorageID(),
+  volume.getStorageID())) {
+dfsUsed += replicaInfo.getBytesOnDisk();
+dfsUsed += replicaInfo.getMetadataLength();
+count++;
+  }
+}
+  }
{code}
I think {{replicaInfo.getBlockDataLength}} will be more accurate than 
{{replicaInfo.getBytesOnDisk}}. It gets the length from actual file while 
getBytesOnDisk is from block inner variable.

Will take detail review soon.


> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-17 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887541#comment-16887541
 ] 

Lisheng Sun commented on HDFS-14313:


Hi [~jojochuang] [~linyiqun] [~stack]  I have update this patch and Could you 
have time to review it ? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-11 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883490#comment-16883490
 ] 

Lisheng Sun commented on HDFS-14313:


Ping [~jojochuang]:) Could you continue to help review it? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880463#comment-16880463
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m  
4s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 56s{color} | {color:orange} root: The patch generated 2 new + 245 unchanged 
- 1 fixed = 247 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
51s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973924/HDFS-14313.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 2cf75600d9a4 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880454#comment-16880454
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
59s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  4s{color} | {color:orange} root: The patch generated 2 new + 245 unchanged 
- 1 fixed = 247 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
21s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 15s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973922/HDFS-14313.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-08 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880257#comment-16880257
 ] 

Lisheng Sun commented on HDFS-14313:


Thank [~jojochuang] [~hexiaoqiao] for reviewing.

 I have update patch for 006 as your comments [~jojochuang]. Could you have 
time to review it? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-08 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880052#comment-16880052
 ] 

Wei-Chiu Chuang commented on HDFS-14313:


Thank you [~leosun08]. I was out for a few days.

I think overall the patch is almost ready. Please take care of a few nits that 
I spotted.

ReplicaCachingGetSpaceUsed#run() would throw an NPE if ExternalDatasetImpl is 
used since ExternalDatasetImpl#deepCopyReplica() returns a null. IMO, it should 
throw an exception to indicate it is not supported, or return an empty 
Collection.

For the new configuration keys, please update them with a prefix. For example, 
deep.copy.replica.threshold.ms --> 
fs.getspaceused.deep.copy.replica.threshold.ms

Please add description of the new configurations into 
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-07 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880028#comment-16880028
 ] 

He Xiaoqiao commented on HDFS-14313:


[~leosun08] Thanks for your report + patch this issue. And sorry for missing 
this information. I would like to take review this week. Thanks again.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16879882#comment-16879882
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
57s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 16s{color} | {color:orange} root: The patch generated 2 new + 245 unchanged 
- 1 fixed = 247 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
22s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 59s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}192m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973852/HDFS-14313.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cd4c7e158ee8 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-05 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16879582#comment-16879582
 ] 

Lisheng Sun commented on HDFS-14313:


Hi [~elgoiri] [~jojochuang] [~zvenczel] [~hexiaoqiao] Could you have time to 
review this patch? Thank you .

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-02 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877400#comment-16877400
 ] 

Lisheng Sun commented on HDFS-14313:


ping [~jojochuang] Could you have time to review this patch? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877007#comment-16877007
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  7m 
50s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
19s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  3s{color} | {color:orange} root: The patch generated 4 new + 246 unchanged 
- 1 fixed = 250 total (was 247) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
40s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 50s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}204m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.TestStateAlignmentContextWithHA |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973410/HDFS-14313.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 964a5ce8086c 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872123#comment-16872123
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 55s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m 
50s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 28s{color} | {color:orange} root: The patch generated 10 new + 245 unchanged 
- 1 fixed = 255 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
46s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 27s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m  
0s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}202m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDatanodeDeath |
|   | hadoop.hdfs.TestErasureCodingExerciseAPIs |
|   | hadoop.hdfs.server.namenode.ha.TestHAMetrics |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
|   | hadoop.hdfs.TestDFSStorageStateRecovery |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870885#comment-16870885
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  7m 
46s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
17s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 22s{color} | {color:orange} root: The patch generated 10 new + 245 unchanged 
- 1 fixed = 255 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 51s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
49s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 45s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
48s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}205m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972682/HDFS-14313.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c01c410e47ff 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 
13 15:00:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870832#comment-16870832
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
19s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  7s{color} | {color:orange} root: The patch generated 10 new + 245 unchanged 
- 1 fixed = 255 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m  
7s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 28s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
54s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}192m 32s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972681/HDFS-14313.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux aff07932e8a2 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-23 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870797#comment-16870797
 ] 

Lisheng Sun commented on HDFS-14313:


[~jojochuang] Thanks for your comment .I add synchronized to 
FsDatasetImpl#deepCopyReplica.
{code:java}
 @Override
  public synchronized Set deepCopyReplica(String bpid)
  throws IOException {
Set replicas =
new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
: volumeMap.replicas(bpid));
return replicas;
  }
{code}
I don't use FsDatasetImpl#datasetLock , Because FsDatasetImpl#addBlockPool with 
datasetLock call FsDatasetImpl#deepCopyReplica in another Thread.

Please continue to help review code. Please correct me if I am wrong. Thanks.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-23 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870773#comment-16870773
 ] 

Lisheng Sun commented on HDFS-14313:


[~jojochuang] Thanks for your comments.  I add lock in 
FsDatasetImpl#deepCopyReplica.
{code:java}
@Override
public Set deepCopyReplica(String bpid)
throws IOException {
  try (AutoCloseableLock lock = datasetLock.acquire()) {
Set replicas =
new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
: volumeMap.replicas(bpid));
return replicas;
  }
}
{code}

Please continue to help review this issue. Thank you.


> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-21 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869692#comment-16869692
 ] 

Wei-Chiu Chuang commented on HDFS-14313:


{code:title=ReplicaMap.java}
/**
   * Get a collection of the replicas for given block pool
   * This method is not synchronized. It needs to be synchronized
   * externally using the lock, both for getting the replicas
   * values from the map and iterating over it. Mutex can be accessed using
   * {@link #getLock()} method.
   * 
   * @param bpid block pool id
   * @return a collection of the replicas belonging to the block pool
   */
  Collection replicas(String bpid) {
return map.get(bpid);
  }
{code}

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-21 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869322#comment-16869322
 ] 

Lisheng Sun commented on HDFS-14313:


[~jojochuang] Thans for your review.

{{{quote}}}

{{Additionally, what's the memory consumption look like? I assume it doubles 
DataNode memory usage.}}

{{{quote}}}{{}}

{{It cost memory far less than twice DataNode memory usage. }}
{code:java}
Collection replicaInfos =
(Collection) fsDataset.deepCopyReplica(bpid);
{code}
{{Because }}ReplicaCachingGetSpaceUsed#replicaInfos is local variable and used 
for every block pool, and it also increases reference count. 

Increased memory consumption is  about 4 bytes(8 bytes) * ReplicaInfos.

{quote}

synchronization: I find it hard to believe that FsDatasetImpl#deepCopyReplica() 
is not synchronized to avoid data race.

{quote}

I can't fully understand this synchronization problem. You worry about data 
race happen?

 

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-21 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869237#comment-16869237
 ] 

Wei-Chiu Chuang commented on HDFS-14313:


Thanks.
I really like the benchmark. Do you happen to have the benchmark for 10 million 
block replicas? I guess for HBase clusters, DataNodes are not dense. But these 
days it is not uncommon to find DataNodes with 7 or even 10 million blocks. 
It's fine if you don't have that, and I just wanted to make sure.

Additionally, what's the memory consumption look like? I assume it doubles 
DataNode memory usage.

synchronization: I find it hard to believe that FsDatasetImpl#deepCopyReplica() 
is not synchronized to avoid data race.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-21 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869213#comment-16869213
 ] 

Lisheng Sun commented on HDFS-14313:


ping [~jojochuang] 

  I don’t know I explain whether to clear.  The feature has been running for a 
long time for XiaoMi HBase service and XiaoMi Data stream service that these 
need high performance requirements. Thanks.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-16 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865274#comment-16865274
 ] 

Lisheng Sun commented on HDFS-14313:


[~jojochuang] Thanks for your review.
{quote}FsDatasetImpl#deepCopyReplica() needs to be synchronized on datasetLock. 
Additionally I am not sure the impact of creating creating millions of 
ReplicaInfo for dense DataNodes. That sounds quite heavy.
{quote}
I consider this point when developing the feature. And do test.
||Count(ReplicaInfo) of DataNode  ||Time of  synchronized on  datasetLock||
|1039864|314ms|
|343866|89ms|

We get same result  hundred milliseconds of  millions ReplicaInfos  in 
production environment. The feature has been running for a long time for XiaoMi 
HBase service and does not affect their performance. Thanks.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-15 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864724#comment-16864724
 ] 

Wei-Chiu Chuang commented on HDFS-14313:


I think the motivation makes sense. A few things that caught my attention

FsDatasetImpl#deepCopyReplica() needs to be synchronized on datasetLock. 
Additionally I am not sure the impact of creating creating millions of 
ReplicaInfo for dense DataNodes. That sounds quite heavy.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-06-10 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859745#comment-16859745
 ] 

Lisheng Sun commented on HDFS-14313:


[~joepe] [~zhuqi] [~zvenczel] [~candychencan] Could you please take a look?

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-05-14 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839140#comment-16839140
 ] 

Lisheng Sun commented on HDFS-14313:


Test failures are not caused by this patch. 

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways of DU/DF getting used space that are insufficient.
>  #  Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk sharing by multiple datanode or 
> other servers.
>  Getting hdfs used space from  FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> is very small and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-04-03 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809530#comment-16809530
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
55s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 46s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 56s{color} | {color:orange} root: The patch generated 12 new + 245 unchanged 
- 1 fixed = 257 total (was 246) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 59s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
35s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 36s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
41s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}186m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12964800/HDFS-14313.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 6a24c792fe5b 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7b5b783 |
| maven | version: 

[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-04-03 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808757#comment-16808757
 ] 

Hadoop QA commented on HDFS-14313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
59s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
36s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  2m  
5s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  2m  5s{color} 
| {color:red} root in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 18s{color} | {color:orange} root: The patch generated 15 new + 251 unchanged 
- 1 fixed = 266 total (was 252) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
37s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  3m  
1s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
39s{color} | {color:red} hadoop-common-project/hadoop-common generated 2 new + 
0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
10s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
23s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 90m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common-project/hadoop-common |
|  |  Found reliance on default encoding in 
org.apache.hadoop.fs.CachingGetSpaceUsed.saveSpaceUsed():in 
org.apache.hadoop.fs.CachingGetSpaceUsed.saveSpaceUsed(): new 
java.io.FileWriter(File)  At CachingGetSpaceUsed.java:[line 177] |
|  |  org.apache.hadoop.fs.CachingGetSpaceUsed.saveSpaceUsed() may fail to 
clean up java.io.Writer on checked exception  Obligation to clean up resource 
created at CachingGetSpaceUsed.java:up java.io.Writer on checked exception  
Obligation to clean up resource created at CachingGetSpaceUsed.java:[line 177] 
is