[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942401#comment-16942401 ] Wei-Chiu Chuang commented on HDFS-14313: Conflicts are trivial so I pushed them into branch-3.2 and branch-3.1. Patch attached to the jira for posterity. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14313-branch-2.v1.patch, > HDFS-14313-branch-2.v2.patch, HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, > HDFS-14313.branch-3.0.v2.patch, HDFS-14313.branch-3.1.patch, > HDFS-14313.branch-3.2.patch, HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942389#comment-16942389 ] Wei-Chiu Chuang commented on HDFS-14313: This is missing from branch-3.2 and branch-3.1 unfortunately ... > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0 > > Attachments: HDFS-14313-branch-2.v1.patch, > HDFS-14313-branch-2.v2.patch, HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, > HDFS-14313.branch-3.0.v2.patch, HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902154#comment-16902154 ] Lisheng Sun commented on HDFS-14313: the branch-2.v2 for patch fixes unchecked issue in javac. It can be ignored. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313-branch-2.v1.patch, > HDFS-14313-branch-2.v2.patch, HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, > HDFS-14313.branch-3.0.v2.patch, HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902141#comment-16902141 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} HDFS-14313 does not apply to branch-2. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976946/HDFS-14313-branch-2.v2.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27440/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313-branch-2.v1.patch, > HDFS-14313-branch-2.v2.patch, HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, > HDFS-14313.branch-3.0.v2.patch, HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902024#comment-16902024 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 5s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 10s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 47s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 22s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 45s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 29s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 52s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 11m 52s{color} | {color:red} root-jdk1.7.0_95 with JDK v1.7.0_95 generated 3 new + 1441 unchanged - 3 fixed = 1444 total (was 1444) {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 31s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 10m 31s{color} | {color:red} root-jdk1.8.0_222 with JDK v1.8.0_222 generated 4 new + 1342 unchanged - 4 fixed = 1346 total (was 1346) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 44s{color} | {color:orange} root: The patch generated 3 new + 324 unchanged - 1 fixed = 327 total (was 325) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 16s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 49s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 40s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902009#comment-16902009 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 7s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} branch-3.0 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 34s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 37s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 21s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 56s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 17s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 40s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s{color} | {color:green} branch-3.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 19s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 59s{color} | {color:orange} root: The patch generated 3 new + 296 unchanged - 1 fixed = 299 total (was 297) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 39s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 8s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}191m 14s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.util.TestReadWriteDiskValidator | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:e402791 | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976909/HDFS-14313.branch-3.0.v2.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux f9b46a824269 4.4.0-157-generic #185-Ubuntu SMP Tue Jul 23 09:17:01 UTC 2019 x86_64 x86_64 x86_64 GNU
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901928#comment-16901928 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} branch-3.0 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 31s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 14s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 45s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 52s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 9m 24s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 21m 8s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 16s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 19s{color} | {color:green} branch-3.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 4s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 59s{color} | {color:orange} root: The patch generated 3 new + 296 unchanged - 1 fixed = 299 total (was 297) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 10m 17s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 58s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 27s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}155m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ha.TestZKFailoverController | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:e402791 | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976897/HDFS-14313.branch-3.0.v1.patch | | Optional Tests | dupname asflicense compile javac jav
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901783#comment-16901783 ] Lisheng Sun commented on HDFS-14313: Thank [~linyiqun] for your suggestion. I rename patch for branch-3.x and upload the branch-3.0.v1 patch. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.0.v1.patch, > HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901781#comment-16901781 ] Yiqun Lin commented on HDFS-14313: -- I don't mean you need to patch for all branches, actually two patches are enough, branch-3 and branch-2. Since branch-3 seems not be updated, so I suggest you to patch for branch-3.0 instead of. So if you attach the patch for branch-2, please use the name HDFS-14313-branch-2.v1.patch. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901778#comment-16901778 ] Lisheng Sun commented on HDFS-14313: I attached patch for branch-3. you hope I attach patch for branch-3.0, branch-3.1, branch3.2 and branch2.9. Instead of branch-3 and branch-2? I think I was misunderstanding. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901765#comment-16901765 ] Yiqun Lin commented on HDFS-14313: -- [~leosun08], you need to rename your branch patch to a correct name. Here we need to contain the branch name, e.g. {{HDFS-14313-BRANCH_NAME.patch}}. So if we want to patch for branch-3.0, the patch name should be HDFS-14313-branch-3.0.v1.patch rather than HDFS-14313.branch-3.v1.patch. The BRANCH_NAME will be detected and trigger the Jenkins for corresponding branch. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901758#comment-16901758 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 13m 56s{color} | {color:red} Docker failed to build yetus/hadoop:20ca6773f1e. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976891/HDFS-14313.branch-3.v1.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27429/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch, HDFS-14313.branch-3.v1.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901736#comment-16901736 ] Lisheng Sun commented on HDFS-14313: Thanx [~linyiqun] for all your work for this patch. I will attach the patch for branch-3.x and branch-2.x branches later. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901635#comment-16901635 ] Hudson commented on HDFS-14313: --- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17055 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17055/]) HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo (yqlin: rev a5bb1e8ee871dfff77d0f6921b13c8ffb50e) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * (add) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSCachingGetSpaceUsed.java * (add) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaCachingGetSpaceUsed.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GetSpaceUsed.java * (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestReplicaCachingGetSpaceUsed.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * (edit) hadoop-common-project/hadoop-common/src/main/resources/core-default.xml > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901632#comment-16901632 ] Yiqun Lin commented on HDFS-14313: -- I have committed this to trunk, but find the conflicts when backport to branch-3.x and branch-2.x branches. [~leosun08], would you mind attach the patch for those branches? I think this improvement is very helpful and can be backported in these versions. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901628#comment-16901628 ] Yiqun Lin commented on HDFS-14313: -- LGTM, +1. Committing this. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901393#comment-16901393 ] Wei-Chiu Chuang commented on HDFS-14313: +1 > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901388#comment-16901388 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 10s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 40s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 1s{color} | {color:orange} root: The patch generated 3 new + 284 unchanged - 1 fixed = 287 total (was 285) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 23s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 23s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 54s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}191m 15s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestLargeBlockReport | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976822/HDFS-14313.014.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 21b0e4638ad6 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/pers
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901146#comment-16901146 ] Lisheng Sun commented on HDFS-14313: Fix UT in hadoop.conf.TestCommonConfigurationFields and other UTs fails is unrelated to this patch. Upload the v014 patch. Please help review it. Thank [~linyiqun]. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch, > HDFS-14313.014.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901103#comment-16901103 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 0s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 0s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 48s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 16s{color} | {color:orange} root: The patch generated 1 new + 171 unchanged - 1 fixed = 172 total (was 172) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 59s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 48s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 53s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}228m 25s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.conf.TestCommonConfigurationFields | | | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.server.datanode.TestLargeBlockReport | | | hadoop.hdfs.server.balancer.TestBalancerRPCDelay | | | hadoop.hdfs.TestDecommission | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976803/HDFS-14313.013.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 6
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900863#comment-16900863 ] Lisheng Sun commented on HDFS-14313: Thank [~linyiqun] for your reminder. I rebased the code, updated description about config key "fs.getspaceused.classname" in core-default.xml and uploaded the v013 patch. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch, HDFS-14313.013.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900849#comment-16900849 ] Yiqun Lin commented on HDFS-14313: -- [~leosun08], can you rebase the code? I applied the v012 patch in my local, it showed that the core-default.xml cannot be applied. You may git pull the latest code and then generate the new patch file. {noformat} error: patch failed: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml:3494 error: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml: patch does not apply {noformat} Also can you update the description to following? From the original design, GetSpaceUsed is not only used for telling HDFS space used. {noformat} The class that can tell estimate much space is used in a directory. There are four impl classes that being supported: org.apache.hadoop.fs.DU(default), org.apache.hadoop.fs.WindowsGetSpaceUsed org.apache.hadoop.fs.DFCachingGetSpaceUsed and org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed. And the ReplicaCachingGetSpaceUsed impl class only used in HDFS module. {noformat} > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900753#comment-16900753 ] Lisheng Sun commented on HDFS-14313: {quote} HDFS-14313 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {quote} That I only update core-default.xml leads to this problem. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900748#comment-16900748 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} HDFS-14313 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976792/HDFS-14313.012.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27413/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900724#comment-16900724 ] Lisheng Sun commented on HDFS-14313: Thanx [~linyiqun] [~jojochuang] for your good suggestions. I added the key fs.getspaceused.classname and the usage of these four impl classes in core-default.xml. I have uloaded the v12 patch. Thank you again. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch, HDFS-14313.012.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900699#comment-16900699 ] Yiqun Lin commented on HDFS-14313: -- [~leosun08], at least, I think add the key fs.getspaceused.classname in core-default.xml should be okay as I see this key is missing for a long time. {quote}That ReplicaCachingGetSpaceUsed of hdfs module is added in core-default.xml of common module should not be very good {quote} As [~jojochuang] mentioned, we can document the usage of these four impl classes in config's description to let users know. It will be okay I think. Except this, others is ready to go, :). This will be a very good improvement. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900563#comment-16900563 ] Lisheng Sun commented on HDFS-14313: Thank [~jojochuang] for your suggestion. {quote}One thing that's missing out in the v11 is the update to hdfs-default.xml which was missing since v8. {quote} According to [~linyiqun] suggetion, I define a hard-coded threadold time value like 1000ms in ReplicaCachingGetSpaceUsed. So remove config in hdfs-default.xml. {code:java} private static final long DEEP_COPY_REPLICA_THRESHOLD_MS = 50; private static final long REPLICA_CACHING_GET_SPACE_USED_THRESHOLD_MS = 1000; {code} {quote} Additionally there should be an additional config key "fs.getspaceused.classname" in core-default.xml, and state that possible options are org.apache.hadoop.fs.DU (default) org.apache.hadoop.fs.WindowsGetSpaceUsed org.apache.hadoop.fs.DFCachingGetSpaceUsed org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed {quote} That ReplicaCachingGetSpaceUsed of hdfs module is added in core-default.xml of common module should not be very good. Please correct me if I was wrong. Thank you again. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900561#comment-16900561 ] Yiqun Lin commented on HDFS-14313: -- {quote} One thing that's missing out in the v11 is the update to hdfs-default.xml which was missing since v8. {quote} Not fully get this point. The print time threshold key is no longer needed now. Does this mean we need to add fs.getspaceused.classname key in hdfs-default as well? Hi [~leosun08], can you additionally address the [~jojochuang]'s comment? > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900549#comment-16900549 ] Wei-Chiu Chuang commented on HDFS-14313: Hi [~leosun08] thanks a lot for continued work here, and [~linyiqun] for continued review while I dropped out. I think v11 patch is good. [~linyiqun] if you are ok feel free to commit. One thing that's missing out in the v11 is the update to hdfs-default.xml which was missing since v8. Additionally there should be an additional config key "fs.getspaceused.classname" in core-default.xml, and state that possible options are # org.apache.hadoop.fs.DU (default) # org.apache.hadoop.fs.WindowsGetSpaceUsed # org.apache.hadoop.fs.DFCachingGetSpaceUsed # org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900515#comment-16900515 ] Lisheng Sun commented on HDFS-14313: Thanx [~linyiqun] for your deep review. I updated this patch as your comments. checkstyle issue: {code:java} ./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSCachingGetSpaceUsed.java:64: public Builder setBpid(String bpid) {:35: 'bpid' hides a field. [HiddenField] {code} I think it is not a problem. I haved uploaded the v11 patch. Could you help review it? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch, > HDFS-14313.011.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900239#comment-16900239 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 51s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 21m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m 58s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 59s{color} | {color:orange} root: The patch generated 1 new + 171 unchanged - 1 fixed = 172 total (was 172) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 4s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 8s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}131m 40s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 4s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}274m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.server.namenode.TestFSImageWithXAttr | | | hadoop.hdfs.server.namenode.TestNameNodeMXBean | | | hadoop.hdfs.server.datanode.TestLargeBlockReport | | | hadoop.hdfs.TestErasureCodingMultipleRacks | | | hadoop.hdfs.server.namenode.TestNamenodeRetryCache | | | hadoop.hdfs.TestFileAppend3 | | | hadoop.hdfs.client.impl.TestBlockReaderLocal | | | hadoop.hdfs.server.mover.TestMover | | | hadoop.hdfs.TestErasureCodingPolicies | | | hadoop.hdfs.TestDFSFinalize | | | hadoop.hdfs.server.mover.TestStorageMover | | | hadoop.hdfs.TestDFSPermission | | | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport | | | hadoop.hdfs.tools.TestDFSZKFailoverController | \\ \\ || Subsystem || Report/Notes
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899932#comment-16899932 ] Yiqun Lin commented on HDFS-14313: -- Two review comments: I'd like to decorate the javadoc comment to a more readable way. From {noformat} Fast and accurate class to tell how much space HDFS is using. This class get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory for HDFS. Get hdfs used space by ReplicaCachingGetSpaceUsed impl only includes block and meta, but get space used by DU impl includes include all directories size, other files such as VERSION, in_use.lock and so on. Get space used by DU impl must be greater than by ReplicaCachingGetSpaceUsed impl. Get space used by ReplicaCachingGetSpaceUsed impl is more accurate. setting fs.getspaceused.classname to org.apache.hadoop.hdfs.server.datanode.fsdataset impl.ReplicaCachingGetSpaceUsed in your core-site.xml if we want to enable this class. {noformat} To {noformat} Fast and accurate class to tell how much space HDFS is using. This class gets hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos that uses an in memory way. Getting hdfs used space by ReplicaCachingGetSpaceUsed impl only includes block and meta files, but DU impl is blockpool dir based statistics that will include additional files, e.g. tmp dir, scanner.cursor file. Getting space used by DU impl will be greater than by ReplicaCachingGetSpaceUsed impl, but the latter is more accurate. Setting fs.getspaceused.classname to org.apache.hadoop.hdfs.server.datanode.fsdataset impl.ReplicaCachingGetSpaceUsed in your core-site.xml if we want to enable this class. {noformat} {noformat} os.close(); fs.delete(new Path("/testReplicaCachingGetSpaceUsedByRBWReplica"), true); {noformat} Can we add the asset operation again after close operation? After close operation, the replica state will be transformed from RBW to finalized. But the space used of these replicas are all included and the dfsUsed value should be same. {noformat} os.close(); assertEquals(blockLength + metaLength, dataNode.getFSDataset().getDfsUsed()); fs.delete(new Path("/testReplicaCachingGetSpaceUsedByRBWReplica"), true); {noformat} > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899565#comment-16899565 ] Lisheng Sun commented on HDFS-14313: ping [~linyiqun] Could you mind taking a review for this patch? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898122#comment-16898122 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 7s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 40s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 4s{color} | {color:orange} root: The patch generated 3 new + 171 unchanged - 1 fixed = 174 total (was 172) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 45s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 37s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 55s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}189m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport | | | hadoop.hdfs.server.datanode.TestLargeBlockReport | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976426/HDFS-14313.010.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 3084f477ad20 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a7371a7 | | maven | version: Apache Maven 3.
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897988#comment-16897988 ] Lisheng Sun commented on HDFS-14313: Thanx for [~linyiqun] for deep review. I have updated this patch as your comments and uploaded the v10 patch. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch, HDFS-14313.010.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897804#comment-16897804 ] Yiqun Lin commented on HDFS-14313: -- Almost loogs good now, some minor comments from me: {noformat} The deepCopyReplica call does't use the datasetock s {noformat} One typo , does't --> doesn't {noformat} setting set fs.getspaceused.classname {noformat} please remove the redundant set and update setting to Setting. {noformat} "blockPoolId: {}, replicas size: {}, copy replicas duration: {}ms" {noformat} Can you update to {noformat} Copy replica infos, blockPoolId: {}, replicas size: {}, duration: {}ms" {noformat} Update refresh to Refresh. {noformat} fs.getClient().delete("/testReplicaCachingGetSpaceUsed", true); {noformat} We can directly call the filesystem api to delete file. {noformat} fs.delete(new Path("/testReplicaCachingGetSpaceUsed"), true); {noformat} {quote}Get space used by DU impl must be greater than by ReplicaCachingGetSpaceUsed impl. get space used by ReplicaCachingGetSpaceUsed impl is more accurate. so is it necessary to add comparison for the DU impl class? {quote} You have raised up a good point the ReplicaCachingGetSpaceUsed way will only calculate the finalized blocks while du command way includes more files. Can you comment this important difference in the javadoc comment of class ReplicaCachingGetSpaceUsed? We should let others know which files this class will calculate for. Yes, the calculation way is different now. Can you add an additional test to test with the case that some blocks files are not finalized, for example being rbw state? And then we check if the dfsused is correctly be updated. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896696#comment-16896696 ] Lisheng Sun commented on HDFS-14313: hi [~linyiqun] Could you have time to continue review ? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894989#comment-16894989 ] Hadoop QA commented on HDFS-14313: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 2s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 16s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 16s{color} | {color:orange} root: The patch generated 3 new + 171 unchanged - 1 fixed = 174 total (was 172) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 17s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 78m 51s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 57s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}191m 55s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976087/HDFS-14313.009.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c7be60022bf8 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 02bd02b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/27325/artifact/out/diff-chec
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894898#comment-16894898 ] Lisheng Sun commented on HDFS-14313: Updated patch according the review comments and uploaded the v9 patch. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch, HDFS-14313.009.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894893#comment-16894893 ] Lisheng Sun commented on HDFS-14313: Thanx [~linyiqun] for your review. *FSCachingGetSpaceUsed* {quote}Line 53: Add the final keyword for the FsVolumeImpl variable. Line 54: Add the final keyword too. {quote} The value of volume and bpid is assignment by set*, so don't add the final keyword for these two variable. {quote}Line 75: We don't pass the config to use the threshold time now, still we need to override this method? If don't need, the change made in GetSpaceUsed can also be reverted. {quote} Add the new variable volume and bpid of FSCachingGetSpaceUsed for HDFS module,so ReplicaCachingGetSpaceUsed‘s Constructor parameter must be FSCachingGetSpaceUsed#Builder and don't remove FSCachingGetSpaceUsed#build(). * TestReplicaCachingGetSpaceUsed* {quote}Line 69: As I have mentioned before, can we have an additional comparison for the DU impl class? The most of lines can be reused for these two getused impl class. Just passing different key value with restart the mini cluster and comparing the used space. {quote} get space used by DU impl include ├── current │ ├── BP-1876464514-10.239.56.179-1564369203299 │ │ ├── current │ │ │ ├── VERSION │ │ │ ├── finalized │ │ │ │ └── subdir0 │ │ │ │ └── subdir0 │ │ │ │ ├── blk_1073741825 │ │ │ │ └── blk_1073741825_1001.meta │ │ │ └── rbw │ │ ├── scanner.cursor │ │ └── tmp │ └── VERSION └── in_use.lock get space used by ReplicaCachingGetSpaceUsed impl include ├── blk_1073741825 └── blk_1073741825_1001.meta get space used by DU impl include all directories size, other files such as VERSION, in_use.lock and so on Get space used by DU impl must be greater than by ReplicaCachingGetSpaceUsed impl. get space used by ReplicaCachingGetSpaceUsed impl is more accurate. so is it necessary to add comparison for the DU impl class? Please correct me if I was wrong. Thank [~linyiqun] again. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894619#comment-16894619 ] Yiqun Lin commented on HDFS-14313: -- Thanks for updating the patch, [~leosun08]. Some review comments for the patch: *FSCachingGetSpaceUsed* Line 35: Can we update the annotation to {{@InterfaceAudience.Private}} since this is only specified for HDFS module.? Line 38: Remove the public keyword for LOG instance. Line 53: Add the final keyword for the FsVolumeImpl variable. Line 54: Add the final keyword too. Line 75: We don't pass the config to use the threshold time now, still we need to override this method? If don't need, the change made in GetSpaceUsed can also be reverted. *FsDatasetImpl* {noformat} FsVolumeList#addBlockPool -> FsVolumeImpl#addBlockPool -> new BlockPoolSlice -> FsDatasetImpl#deepCopyReplica. If deepCopyReplica use datasetock, it appears deadlock. {noformat} This comment can simplified to 'The deepCopyReplica call does't use the datasetock since it will lead the potential deadlock with the {@link FsVolumeList#addBlockPool} call.' *ReplicaCachingGetSpaceUsed* Line 41: {noformat} To use set fs.getspaceused.classname to org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed in your core-site.xml. {noformat} can update to a more readable way: {noformat} Setting fs.getspaceused.classname to org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed in core-site.xml if we want to enable this class. {noformat} Line 46: Update the annotation to {{@InterfaceAudience.Private}}. Line 49: Remove the public keyword. Line 51: Add the final keyword. Line 52: Add the final keyword. Line76: I would prefer to define a static final variable to hard-coded the value in ReplicaCachingGetSpaceUsed. Line 77: Can we use the parameter way to print the log? Like LOG.debug("BlockPoolId: {}, replicas size: {}, copy replicas duration: {}ms.", obj1, obj2...} Line 95: The same comment for Line 76. The 96:The same comment for Line 77. Line 101: Update replicaCachingGetSpaceUsed to ReplicaCachingGetSpaceUsed. *TestReplicaCachingGetSpaceUsed* Line 42: Please add the definition comment for this class. Line 51: We can use Configuration#setClass call to set the class impl {noformat} conf.setClass("fs.getspaceused.classname", ReplicaCachingGetSpaceUsed.class, CachingGetSpaceUsed.class); {noformat} Line 61: It will be a good behavior to clean the hdfs path we created for the test. Line 69: As I have mentioned before, can we have an additional comparison for the DU impl class? The most of lines can be reused for these two getused impl class. Just passing different key value with restart the mini cluster and comparing the used space. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894465#comment-16894465 ] Lisheng Sun commented on HDFS-14313: hi [~linyiqun] [~jojochuang] Could you have time to review this patch? Thank you a lot. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893159#comment-16893159 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 8m 50s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 52s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 16s{color} | {color:orange} root: The patch generated 4 new + 172 unchanged - 1 fixed = 176 total (was 173) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 18s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 34s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 49s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}201m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.0 Server=19.03.0 Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12975834/HDFS-14313.008.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6798be829d86 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/p
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892921#comment-16892921 ] Lisheng Sun commented on HDFS-14313: Thank [~linyiqun] for your review. I have update the patch as your comments. Could you help review it? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch, > HDFS-14313.008.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891132#comment-16891132 ] Yiqun Lin commented on HDFS-14313: -- Hi [~leosun08], 1. {quote} I don't use the component specific impl class in the common module. to update in GetSpaceUsed of common model is just to make subclasses inheritable. And that to update in CommonConfigurationKeys of common model is to print threshold time,which should be moved to DFSConfigKeys and is more appropriate.. {quote} I don't think it's necessary to have two new config to make the threshold time configurable, the hard-coded way should be enough. I mean we can define a hard-coded threadold time value like 1000ms in ReplicaCachingGetSpaceUsed. So that we don't need to do any change in Common. 2. Can you add comment for the potential dead lock issue in the method deepCopyReplica? That will be let others know the context. Besides above two points, others make sense to me. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890237#comment-16890237 ] Lisheng Sun commented on HDFS-14313: hi [~linyiqun], could you mind continuing to reiview this patch? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889667#comment-16889667 ] Lisheng Sun commented on HDFS-14313: Thank [~linyiqun] for your review carefully. I have a few questions to discuss with you. 1 I don't use the component specific impl class in the common module. to update in GetSpaceUsed of common model is just to make subclasses inheritable. And that to update in CommonConfigurationKeys of common model is to print threshold time,which should be moved to DFSConfigKeys and is more appropriate. 2. Now there are switches that are used in control. follow as GetSpaceUsed#Builder#CLASSNAME_KEY {code:java} // static final String CLASSNAME_KEY = "fs.getspaceused.classname"; {code} if add enableFSCachingGetSpace as you say, there are two switches that give the user more confusion. 3. {quote}Even though deepCopyReplica is only used by another thread, I still prefer to let it be an atomic operation incase this will be used in other places in the future. Can you add datasetock here? {quote} FsDatasetImpl#addBlockPool with datasetLock ->FsVolumeList#addBlockPool {code:java} @Override public void addBlockPool(String bpid, Configuration conf) throws IOException { LOG.info("Adding block pool " + bpid); try (AutoCloseableLock lock = datasetLock.acquire()) { volumes.addBlockPool(bpid, conf); volumeMap.initBlockPool(bpid); } volumes.getAllVolumesMap(bpid, volumeMap, ramDiskReplicaTracker); } {code} FsVolumeList#addBlockPool ->FsVolumeImpl#addBlockPool -> new BlockPoolSlice ->FsDatasetImpl#deepCopyReplica. If deepCopyReplica use datasetock, it appears deadlock. So use Collections.unmodifiableSet to make replica info is not allowed to be modified outside {code:java} void addBlockPool(final String bpid, final Configuration conf) throws IOException { long totalStartTime = Time.monotonicNow(); final Map unhealthyDataDirs = new ConcurrentHashMap(); List blockPoolAddingThreads = new ArrayList(); for (final FsVolumeImpl v : volumes) { Thread t = new Thread() { public void run() { try (FsVolumeReference ref = v.obtainReference()) { FsDatasetImpl.LOG.info("Scanning block pool " + bpid + " on volume " + v + "..."); long startTime = Time.monotonicNow(); v.addBlockPool(bpid, conf); long timeTaken = Time.monotonicNow() - startTime; FsDatasetImpl.LOG.info("Time taken to scan block pool " + bpid + " on " + v + ": " + timeTaken + "ms"); } catch (ClosedChannelException e) { // ignore. } catch (IOException ioe) { FsDatasetImpl.LOG.info("Caught exception while scanning " + v + ". Will throw later.", ioe); unhealthyDataDirs.put(v, ioe); } } }; blockPoolAddingThreads.add(t); t.start(); } {code} 4. According to your suggestion, I will modify UT for using real minicluster and adding a comparison test by respectively using FSCachingGetUsed and default Du way. Please correct me if I was wrong. Thank [~linyiqun] again. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888592#comment-16888592 ] Yiqun Lin commented on HDFS-14313: -- Take a deep review for the patch, this FSCachingGetSpaceUsed should be the HDFS specific impl class. I prefer not to make any change in the Common module. We can do this extend like HDFS DFSNetworkTopology did, add new config to enable or disable this. The instance will be got like this: {code:java} if(enableFSCachingGetSpace) { this.dfsUsage = new CachingGetSpaceUsed.Builder().setPath(bpDir) .setConf(conf) .setInitialUsed(loadDfsUsed()) .build(); } else { this.dfsUsage = new FSCachingGetSpaceUsed.Builder().setBpid(bpid) .setVolume(volume) .setPath(bpDir) .setConf(conf) .setInitialUsed(loadDfsUsed()) .build(); } {code} {code:java} + @Override + public Set deepCopyReplica(String bpid) + throws IOException { +Set replicas = +new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET +: volumeMap.replicas(bpid)); +return Collections.unmodifiableSet(replicas); + } {code} Even though deepCopyReplica is only used by another thread, I still prefer to let it be an atomic operation incase this will be used in other places in the future. Can you add datasetock here? {noformat} fs.getspaceused.deep.copy.replica.threshold.ms fs.getspaceused.replica.caching.threshold.ms {noformat} I don't think above two configs are really needed once we make this as the HDFS specific getused class. Instead of, we can still keep this using a hard-coded way. For the unit test {{TestReplicaCachingGetSpaceUsed}}, why not use a more real way to test this? I prefer to start up the real minicluster then write the actual blocks files and do the verification. Mocking way needs many steps to setup the DN and also not convenient to cover all the cases. Also we need a comparison test by respectively using FSCachingGetUsed and default Du way, to check if the dfsUsed calculated is same. [~leosun08], I suppose you may need to do a refactor for the latest patch. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888028#comment-16888028 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 42s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 3s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 36s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 32s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 11s{color} | {color:orange} root: The patch generated 3 new + 245 unchanged - 1 fixed = 248 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 57s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 23s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 39s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 47s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}217m 27s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.tools.TestHdfsConfigFields | | | hadoop.hdfs.TestMultipleNNPortQOP | | | hadoop.hdfs.server.datanode.TestBPOfferService | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.8 Server=18.09.8 Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12975148/HDFS-14313.007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite uni
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888023#comment-16888023 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 38s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 41s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 24s{color} | {color:orange} root: The patch generated 3 new + 245 unchanged - 1 fixed = 248 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 6s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 55s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 54s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}219m 44s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.tools.TestHdfsConfigFields | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12975147/HDFS-14313.007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbu
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887859#comment-16887859 ] Lisheng Sun commented on HDFS-14313: Thank [~linyiqun] for your suggestions. {quote} why not use dataset lock? {quote} I don't use FsDatasetImpl#datasetLock , Because FsDatasetImpl#addBlockPool with datasetLock call FsDatasetImpl#deepCopyReplica in another Thread. According to your suggestion, I use Collections.unmodifiableSet to make replica info is not allowed to be modified outside. And I have updated this patch. Could you help reiview it? Thank you again. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887592#comment-16887592 ] Yiqun Lin commented on HDFS-14313: -- I have two comments for the core implementation of the ReplicaCachingGetSpaceUsed. {code} @Override public synchronized Set deepCopyReplica(String bpid) throws IOException { Set replicas = new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET : volumeMap.replicas(bpid)); return replicas; } {code} why not use dataset lock? Here the volumeMap is thread-safe controlled by the dataset lock in FsDatasetImpl operations. Synchronized cannot ensure the race condition. Also it would be better to use {{ Collections.unmodifiableSet}} to make replica info is not allowed to be modified outside. {code} if (CollectionUtils.isNotEmpty(replicaInfos)) { +for (ReplicaInfo replicaInfo : replicaInfos) { + if (Objects.equals(replicaInfo.getVolume().getStorageID(), + volume.getStorageID())) { +dfsUsed += replicaInfo.getBytesOnDisk(); +dfsUsed += replicaInfo.getMetadataLength(); +count++; + } +} + } {code} I think {{replicaInfo.getBlockDataLength}} will be more accurate than {{replicaInfo.getBytesOnDisk}}. It gets the length from actual file while getBytesOnDisk is from block inner variable. Will take detail review soon. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887541#comment-16887541 ] Lisheng Sun commented on HDFS-14313: Hi [~jojochuang] [~linyiqun] [~stack] I have update this patch and Could you have time to review it ? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883490#comment-16883490 ] Lisheng Sun commented on HDFS-14313: Ping [~jojochuang]:) Could you continue to help review it? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880463#comment-16880463 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 6s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 4s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 56s{color} | {color:orange} root: The patch generated 2 new + 245 unchanged - 1 fixed = 247 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 51s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 37s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}191m 56s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.tools.TestHdfsConfigFields | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973924/HDFS-14313.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 2cf75600d9a4 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880454#comment-16880454 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 59s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 4s{color} | {color:orange} root: The patch generated 2 new + 245 unchanged - 1 fixed = 247 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 21s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 15s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}191m 56s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.tools.TestHdfsConfigFields | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973922/HDFS-14313.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyl
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880257#comment-16880257 ] Lisheng Sun commented on HDFS-14313: Thank [~jojochuang] [~hexiaoqiao] for reviewing. I have update patch for 006 as your comments [~jojochuang]. Could you have time to review it? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch, HDFS-14313.006.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880052#comment-16880052 ] Wei-Chiu Chuang commented on HDFS-14313: Thank you [~leosun08]. I was out for a few days. I think overall the patch is almost ready. Please take care of a few nits that I spotted. ReplicaCachingGetSpaceUsed#run() would throw an NPE if ExternalDatasetImpl is used since ExternalDatasetImpl#deepCopyReplica() returns a null. IMO, it should throw an exception to indicate it is not supported, or return an empty Collection. For the new configuration keys, please update them with a prefix. For example, deep.copy.replica.threshold.ms --> fs.getspaceused.deep.copy.replica.threshold.ms Please add description of the new configurations into hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880028#comment-16880028 ] He Xiaoqiao commented on HDFS-14313: [~leosun08] Thanks for your report + patch this issue. And sorry for missing this information. I would like to take review this week. Thanks again. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879882#comment-16879882 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 28s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 16s{color} | {color:orange} root: The patch generated 2 new + 245 unchanged - 1 fixed = 247 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 22s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 59s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}192m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestMultipleNNPortQOP | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.datanode.TestDataNodeMetrics | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973852/HDFS-14313.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux cd4c7e158ee8 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879582#comment-16879582 ] Lisheng Sun commented on HDFS-14313: Hi [~elgoiri] [~jojochuang] [~zvenczel] [~hexiaoqiao] Could you have time to review this patch? Thank you . > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877400#comment-16877400 ] Lisheng Sun commented on HDFS-14313: ping [~jojochuang] Could you have time to review this patch? Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877007#comment-16877007 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 7m 50s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 2s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 19s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 3s{color} | {color:orange} root: The patch generated 4 new + 246 unchanged - 1 fixed = 250 total (was 247) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 40s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 50s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 43s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}204m 8s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.TestStateAlignmentContextWithHA | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973410/HDFS-14313.004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 964a5ce8086c 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GN
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872123#comment-16872123 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 4s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 2m 50s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 46s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 39s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 28s{color} | {color:orange} root: The patch generated 10 new + 245 unchanged - 1 fixed = 255 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 31s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 12s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 46s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 27s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 1m 0s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}202m 33s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDatanodeDeath | | | hadoop.hdfs.TestErasureCodingExerciseAPIs | | | hadoop.hdfs.server.namenode.ha.TestHAMetrics | | | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy | | | hadoop.hdfs.server.diskbalancer.TestDiskBalancer | | | hadoop.hdfs.TestDFSStorageStateRecovery | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870885#comment-16870885 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 7m 46s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 17s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 22s{color} | {color:orange} root: The patch generated 10 new + 245 unchanged - 1 fixed = 255 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 51s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 49s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 45s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 48s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}205m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.diskbalancer.TestDiskBalancer | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12972682/HDFS-14313.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c01c410e47ff 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 13 15:00:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/pers
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870832#comment-16870832 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 7s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 59s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 19s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 51s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 7s{color} | {color:orange} root: The patch generated 10 new + 245 unchanged - 1 fixed = 255 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 7s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 28s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 54s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}192m 32s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.diskbalancer.TestDiskBalancer | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12972681/HDFS-14313.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux aff07932e8a2 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | |
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870797#comment-16870797 ] Lisheng Sun commented on HDFS-14313: [~jojochuang] Thanks for your comment .I add synchronized to FsDatasetImpl#deepCopyReplica. {code:java} @Override public synchronized Set deepCopyReplica(String bpid) throws IOException { Set replicas = new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET : volumeMap.replicas(bpid)); return replicas; } {code} I don't use FsDatasetImpl#datasetLock , Because FsDatasetImpl#addBlockPool with datasetLock call FsDatasetImpl#deepCopyReplica in another Thread. Please continue to help review code. Please correct me if I am wrong. Thanks. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870773#comment-16870773 ] Lisheng Sun commented on HDFS-14313: [~jojochuang] Thanks for your comments. I add lock in FsDatasetImpl#deepCopyReplica. {code:java} @Override public Set deepCopyReplica(String bpid) throws IOException { try (AutoCloseableLock lock = datasetLock.acquire()) { Set replicas = new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET : volumeMap.replicas(bpid)); return replicas; } } {code} Please continue to help review this issue. Thank you. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869692#comment-16869692 ] Wei-Chiu Chuang commented on HDFS-14313: {code:title=ReplicaMap.java} /** * Get a collection of the replicas for given block pool * This method is not synchronized. It needs to be synchronized * externally using the lock, both for getting the replicas * values from the map and iterating over it. Mutex can be accessed using * {@link #getLock()} method. * * @param bpid block pool id * @return a collection of the replicas belonging to the block pool */ Collection replicas(String bpid) { return map.get(bpid); } {code} > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869322#comment-16869322 ] Lisheng Sun commented on HDFS-14313: [~jojochuang] Thans for your review. {{{quote}}} {{Additionally, what's the memory consumption look like? I assume it doubles DataNode memory usage.}} {{{quote}}}{{}} {{It cost memory far less than twice DataNode memory usage. }} {code:java} Collection replicaInfos = (Collection) fsDataset.deepCopyReplica(bpid); {code} {{Because }}ReplicaCachingGetSpaceUsed#replicaInfos is local variable and used for every block pool, and it also increases reference count. Increased memory consumption is about 4 bytes(8 bytes) * ReplicaInfos. {quote} synchronization: I find it hard to believe that FsDatasetImpl#deepCopyReplica() is not synchronized to avoid data race. {quote} I can't fully understand this synchronization problem. You worry about data race happen? > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869237#comment-16869237 ] Wei-Chiu Chuang commented on HDFS-14313: Thanks. I really like the benchmark. Do you happen to have the benchmark for 10 million block replicas? I guess for HBase clusters, DataNodes are not dense. But these days it is not uncommon to find DataNodes with 7 or even 10 million blocks. It's fine if you don't have that, and I just wanted to make sure. Additionally, what's the memory consumption look like? I assume it doubles DataNode memory usage. synchronization: I find it hard to believe that FsDatasetImpl#deepCopyReplica() is not synchronized to avoid data race. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869213#comment-16869213 ] Lisheng Sun commented on HDFS-14313: ping [~jojochuang] I don’t know I explain whether to clear. The feature has been running for a long time for XiaoMi HBase service and XiaoMi Data stream service that these need high performance requirements. Thanks. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865274#comment-16865274 ] Lisheng Sun commented on HDFS-14313: [~jojochuang] Thanks for your review. {quote}FsDatasetImpl#deepCopyReplica() needs to be synchronized on datasetLock. Additionally I am not sure the impact of creating creating millions of ReplicaInfo for dense DataNodes. That sounds quite heavy. {quote} I consider this point when developing the feature. And do test. ||Count(ReplicaInfo) of DataNode ||Time of synchronized on datasetLock|| |1039864|314ms| |343866|89ms| We get same result hundred milliseconds of millions ReplicaInfos in production environment. The feature has been running for a long time for XiaoMi HBase service and does not affect their performance. Thanks. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16864724#comment-16864724 ] Wei-Chiu Chuang commented on HDFS-14313: I think the motivation makes sense. A few things that caught my attention FsDatasetImpl#deepCopyReplica() needs to be synchronized on datasetLock. Additionally I am not sure the impact of creating creating millions of ReplicaInfo for dense DataNodes. That sounds quite heavy. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859745#comment-16859745 ] Lisheng Sun commented on HDFS-14313: [~joepe] [~zhuqi] [~zvenczel] [~candychencan] Could you please take a look? > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839140#comment-16839140 ] Lisheng Sun commented on HDFS-14313: Test failures are not caused by this patch. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch > > > There are two ways of DU/DF getting used space that are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk sharing by multiple datanode or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very small and accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809530#comment-16809530 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 55s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 52s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 56s{color} | {color:orange} root: The patch generated 12 new + 245 unchanged - 1 fixed = 257 total (was 246) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 59s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 35s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 36s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 41s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}186m 56s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics | | | hadoop.tools.TestHdfsConfigFields | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14313 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12964800/HDFS-14313.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6a24c792fe5b 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7b5b783 | | maven
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808757#comment-16808757 ] Hadoop QA commented on HDFS-14313: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 36s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 2m 5s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 5s{color} | {color:red} root in the patch failed. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 18s{color} | {color:orange} root: The patch generated 15 new + 251 unchanged - 1 fixed = 266 total (was 252) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 37s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 3m 1s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 39s{color} | {color:red} hadoop-common-project/hadoop-common generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 22s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 10s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 37s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 90m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-common-project/hadoop-common | | | Found reliance on default encoding in org.apache.hadoop.fs.CachingGetSpaceUsed.saveSpaceUsed():in org.apache.hadoop.fs.CachingGetSpaceUsed.saveSpaceUsed(): new java.io.FileWriter(File) At CachingGetSpaceUsed.java:[line 177] | | | org.apache.hadoop.fs.CachingGetSpaceUsed.saveSpaceUsed() may fail to clean up java.io.Writer on checked exception Obligation to clean up resource created at CachingGetSpaceUsed.java:up java.io.Writer on checked exception Obligation to clean up resource created at CachingGetSpaceUsed.java:[