[jira] [Updated] (HDFS-15394) Add all available fs.viewfs.overload.scheme.target.<scheme>.impl classes in core-default.xml by default.
[ https://issues.apache.org/jira/browse/HDFS-15394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15394: --- Fix Version/s: 3.3.1 > Add all available fs.viewfs.overload.scheme.target.<scheme>.impl classes in > core-default.xml by default. > --- > > Key: HDFS-15394 > URL: https://issues.apache.org/jira/browse/HDFS-15394 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: configuration, viewfs, viewfsOverloadScheme > Affects Versions: 3.2.1 > Reporter: Uma Maheswara Rao G > Assignee: Uma Maheswara Rao G > Priority: Major > Fix For: 3.3.1, 3.4.0 > > > This proposes to add all available > fs.viewfs.overload.scheme.target.<scheme>.impl classes in core-default.xml, > so that users need not configure them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
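For illustration, the resulting core-default.xml entries could look like the sketch below. The property-name pattern is fs.viewfs.overload.scheme.target.<scheme>.impl; the two schemes and impl classes shown are examples from memory, not the definitive list shipped by the patch:

```xml
<!-- Illustrative sketch only: two example entries following the
     fs.viewfs.overload.scheme.target.<scheme>.impl pattern. -->
<property>
  <name>fs.viewfs.overload.scheme.target.hdfs.impl</name>
  <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
</property>
<property>
  <name>fs.viewfs.overload.scheme.target.file.impl</name>
  <value>org.apache.hadoop.fs.LocalFileSystem</value>
</property>
```

With such entries baked into core-default.xml, a user enabling ViewFSOverloadScheme would not need to declare a target impl for each scheme by hand.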
[jira] [Updated] (HDFS-15387) FSUsage$DF should consider ViewFSOverloadScheme in processPath
[ https://issues.apache.org/jira/browse/HDFS-15387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15387: --- Fix Version/s: 3.3.1 > FSUsage$DF should consider ViewFSOverloadScheme in processPath > -- > > Key: HDFS-15387 > URL: https://issues.apache.org/jira/browse/HDFS-15387 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: viewfs > Affects Versions: 3.2.1 > Reporter: Uma Maheswara Rao G > Assignee: Uma Maheswara Rao G > Priority: Minor > Fix For: 3.3.1, 3.4.0 > > > Currently, when calculating DF, processPath checks whether the filesystem's scheme is > ViewFS; if so, it gets the status from every mounted filesystem and aggregates. If not, > it directly calls fs.getStatus. > ViewFSOverloadScheme should also be handled via the ViewFS flow here.
[jira] [Updated] (HDFS-15321) Make DFSAdmin tool to work with ViewFSOverloadScheme
[ https://issues.apache.org/jira/browse/HDFS-15321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15321: --- Fix Version/s: 3.4.0 3.3.1 > Make DFSAdmin tool to work with ViewFSOverloadScheme > > > Key: HDFS-15321 > URL: https://issues.apache.org/jira/browse/HDFS-15321 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsadmin, fs, viewfs > Affects Versions: 3.2.1 > Reporter: Uma Maheswara Rao G > Assignee: Uma Maheswara Rao G > Priority: Major > Fix For: 3.3.1, 3.4.0 > > > When we enable ViewFSOverloadScheme and use the hdfs scheme as the overloaded > scheme, users work with hdfs URIs. But DFSAdmin expects the impl class > to be DistributedFileSystem; if the impl class is ViewFSOverloadScheme, it will > fail. > So, when the impl is ViewFSOverloadScheme, we should get the corresponding child hdfs > filesystem to make DFSAdmin work. > This Jira makes DFSAdmin work with ViewFSOverloadScheme.
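The unwrapping described above can be sketched in plain Java. The class names mirror Hadoop's, but these are simplified stand-ins (not the real org.apache.hadoop types or APIs, and getChildFileSystem is a hypothetical accessor), just to show the resolution idea:

```java
// Simplified stand-ins for the real Hadoop classes (illustrative only).
interface FileSystem {}

class DistributedFileSystem implements FileSystem {}

// ViewFSOverloadScheme wraps target filesystems; when the overloaded scheme
// is hdfs, one of its children is the actual DistributedFileSystem.
class ViewFSOverloadScheme implements FileSystem {
    private final FileSystem child = new DistributedFileSystem();
    FileSystem getChildFileSystem() { return child; }  // hypothetical accessor
}

public class DFSAdminSketch {
    // DFSAdmin needs a DistributedFileSystem; if the configured impl is the
    // overload scheme, resolve through to the child instead of failing.
    static DistributedFileSystem resolveDfs(FileSystem fs) {
        if (fs instanceof DistributedFileSystem) {
            return (DistributedFileSystem) fs;
        }
        if (fs instanceof ViewFSOverloadScheme) {
            FileSystem child = ((ViewFSOverloadScheme) fs).getChildFileSystem();
            if (child instanceof DistributedFileSystem) {
                return (DistributedFileSystem) child;
            }
        }
        throw new IllegalArgumentException("not an HDFS-backed filesystem");
    }

    public static void main(String[] args) {
        System.out.println(
            resolveDfs(new ViewFSOverloadScheme()).getClass().getSimpleName());
    }
}
```

The point is only the shape of the check: admin commands keep working against hdfs URIs whether the impl is the plain DistributedFileSystem or the overload wrapper.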
[jira] [Updated] (HDFS-15389) DFSAdmin should close filesystem and dfsadmin -setBalancerBandwidth should work with ViewFSOverloadScheme
[ https://issues.apache.org/jira/browse/HDFS-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15389: --- Fix Version/s: 3.3.1 > DFSAdmin should close filesystem and dfsadmin -setBalancerBandwidth should > work with ViewFSOverloadScheme > -- > > Key: HDFS-15389 > URL: https://issues.apache.org/jira/browse/HDFS-15389 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsadmin, viewfsOverloadScheme > Affects Versions: 3.2.1 > Reporter: Ayush Saxena > Assignee: Ayush Saxena > Priority: Major > Fix For: 3.3.1, 3.4.0 > > Attachments: HDFS-15389-01.patch > > > Two issues here: > Firstly, prior to HDFS-15321, when DFSAdmin was closed the FileSystem > associated with it was closed as part of the close method. But post HDFS-15321, > the {{FileSystem}} isn't stored as part of {{FsShell}}, hence during close > the FileSystem stays open and isn't closed. > * This is the reason for the failure of TestDFSHAAdmin > Secondly: {{DfsAdmin -setBalancerBandwidth}} doesn't work with > {{ViewFSOverloadScheme}}, since setBalancerBandwidth calls {{getFS()}} > rather than {{getDFS()}}, which is what resolves the scheme per {{HDFS-15321}}
[jira] [Updated] (HDFS-15330) Document the ViewFSOverloadScheme details in ViewFS guide
[ https://issues.apache.org/jira/browse/HDFS-15330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15330: --- Fix Version/s: 3.3.1 > Document the ViewFSOverloadScheme details in ViewFS guide > - > > Key: HDFS-15330 > URL: https://issues.apache.org/jira/browse/HDFS-15330 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: viewfs, viewfsOverloadScheme > Affects Versions: 3.2.1 > Reporter: Uma Maheswara Rao G > Assignee: Uma Maheswara Rao G > Priority: Major > Fix For: 3.3.1, 3.4.0 > > > This Jira tracks the documentation of the ViewFSOverloadScheme usage guide.
[jira] [Updated] (HDFS-15322) Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris schemes are same.
[ https://issues.apache.org/jira/browse/HDFS-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15322: --- Fix Version/s: 3.3.1 > Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris > schemes are same. > > > Key: HDFS-15322 > URL: https://issues.apache.org/jira/browse/HDFS-15322 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, nflyFs, viewfs, viewfsOverloadScheme > Affects Versions: 3.2.1 > Reporter: Uma Maheswara Rao G > Assignee: Uma Maheswara Rao G > Priority: Major > Fix For: 3.3.1, 3.4.0 > > > Currently an Nfly mount link will not work when we use ViewFSOverloadScheme, > because when the configured scheme is hdfs and the target URI scheme is also hdfs, > it hits the same looping issue we discussed in the design. We > need to use FsGetter to handle the looping.
[jira] [Updated] (HDFS-15306) Make mount-table to read from central place ( Let's say from HDFS)
[ https://issues.apache.org/jira/browse/HDFS-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15306: --- Fix Version/s: 3.3.1 > Make mount-table to read from central place ( Let's say from HDFS) > -- > > Key: HDFS-15306 > URL: https://issues.apache.org/jira/browse/HDFS-15306 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: configuration, hadoop-client > Affects Versions: 3.2.1 > Reporter: Uma Maheswara Rao G > Assignee: Uma Maheswara Rao G > Priority: Major > Fix For: 3.3.1, 3.4.0 > > > ViewFsOverloadScheme should be able to read the mount-table.xml configuration > from remote servers. > Below are the options discussed in the design doc: > # XInclude and HTTP Server (including WebHDFS) > # Hadoop Compatible FS (*HCFS*) > a) Keep mount-table in a Hadoop compatible FS > b) Read mount-table from a Hadoop compatible FS using XInclude > We prefer to have 1 and 2a. For 1 we don't need to modify any code, so this > Jira can cover 2a.
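Option 1 above needs no code change because Hadoop's Configuration loader already supports XInclude in its XML files. A hypothetical core-site.xml sketch (the host, port, and file name are made up for illustration):

```xml
<!-- Illustrative only: pull the mount-table from a central HTTP server
     via XInclude; config-host:8080/mount-table.xml is hypothetical. -->
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="http://config-host:8080/mount-table.xml" />
</configuration>
```

Option 2a would instead point at a file stored in a Hadoop compatible filesystem, which is the part this Jira covers.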
[jira] [Commented] (HDFS-15372) Files in snapshots no longer see attribute provider permissions
[ https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17137978#comment-17137978 ] Hudson commented on HDFS-15372: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18354 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18354/]) HDFS-15372. Files in snapshots no longer see attribute provider (weichiu: rev 730a39d1388548f22f76132a6734d61c24c3eb72) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodesInPath.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java > Files in snapshots no longer see attribute provider permissions > --- > > Key: HDFS-15372 > URL: https://issues.apache.org/jira/browse/HDFS-15372 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Stephen O'Donnell > Assignee: Stephen O'Donnell > Priority: Major > Fix For: 3.3.1, 3.4.0 > > Attachments: HDFS-15372.001.patch, HDFS-15372.002.patch, > HDFS-15372.003.patch, HDFS-15372.004.patch, HDFS-15372.005.patch > > > Given a cluster with an authorization provider configured (e.g. Sentry) where the > paths covered by the provider are snapshottable, there was a change in > behaviour in how the provider permissions and ACLs are applied to files in > snapshots between the 2.x branch and Hadoop 3.0. > E.g., if we have the snapshottable path /data, which is Sentry managed. The ACLs > below are provided by Sentry: > {code} > hadoop fs -getfacl -R /data > # file: /data > # owner: hive > # group: hive > user::rwx > group::rwx > other::--x > # file: /data/tab1 > # owner: hive > # group: hive > user::rwx > group::--- > group:flume:rwx > user:hive:rwx > group:hive:rwx > group:testgroup:rwx > mask::rwx > other::--x > /data/tab1 > {code} > After taking a snapshot, the files in the snapshot do not see the provider > permissions: > {code} > hadoop fs -getfacl -R /data/.snapshot > # file: /data/.snapshot > # owner: > # group: > user::rwx > group::rwx > other::rwx > # file: /data/.snapshot/snap1 > # owner: hive > # group: hive > user::rwx > group::rwx > other::--x > # file: /data/.snapshot/snap1/tab1 > # owner: hive > # group: hive > user::rwx > group::rwx > other::--x > {code} > However, pre-Hadoop 3.0 (when the attribute provider etc. was extensively > refactored) snapshots did get the provider permissions. > The reason is this code in FSDirectory.java, which ultimately calls the > attribute provider and passes the path we want permissions for: > {code} > INodeAttributes getAttributes(INodesInPath iip) > throws IOException { > INode node = FSDirectory.resolveLastINode(iip); > int snapshot = iip.getPathSnapshotId(); > INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot); > UserGroupInformation ugi = NameNode.getRemoteUser(); > INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi); > if (ap != null) { > // permission checking sends the full components array including the > // first empty component for the root. however file status > // related calls are expected to strip out the root component according > // to TestINodeAttributeProvider. > byte[][] components = iip.getPathComponents(); > components = Arrays.copyOfRange(components, 1, components.length); > nodeAttrs = ap.getAttributes(components, nodeAttrs); > } > return nodeAttrs; > } > {code} > The line: > {code} > INode node = FSDirectory.resolveLastINode(iip); > {code} > picks the last resolved inode, and if you then call node.getPathComponents > for a path like '/data/.snapshot/snap1/tab1' it will return /data/tab1. It > resolves the snapshot path to its original location, but it's still the > snapshot inode. > However, the logic passes 'iip.getPathComponents', which returns > "/data/.snapshot/snap1/tab1", to the provider. > The pre-Hadoop 3.0 code passes the inode directly to the provider, and hence > it only ever sees the path as "/data/tab1". > It is debatable which path should be passed to the provider - > /data/.snapshot/snap1/tab1 or /data/tab1 in the case of snapshots. However, as > the behaviour has changed I feel we should ensure the old behaviour is > retained. > It would also be fairly easy to provide a config switch so the provider gets > the full snapshot path or the resolved path.
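The root-stripping step in the getAttributes snippet above can be shown with self-contained Java; the byte[][] layout here stands in for what iip.getPathComponents() returns for a path like "/data/tab1" (a leading empty component for the root):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PathComponentsDemo {
    public static void main(String[] args) {
        // getPathComponents() for "/data/tab1" yields an empty root component
        // followed by each path element, encoded as bytes.
        byte[][] components = {
            "".getBytes(StandardCharsets.UTF_8),
            "data".getBytes(StandardCharsets.UTF_8),
            "tab1".getBytes(StandardCharsets.UTF_8),
        };
        // getAttributes strips the empty root component before handing the
        // array to the attribute provider (file-status-related calls expect
        // the root component removed).
        byte[][] stripped = Arrays.copyOfRange(components, 1, components.length);
        for (byte[] c : stripped) {
            System.out.println(new String(c, StandardCharsets.UTF_8));
        }
        // prints "data" then "tab1"
    }
}
```

The subtlety in the bug is which components array gets built: one derived from the snapshot path ("/data/.snapshot/snap1/tab1") versus one derived from the resolved inode ("/data/tab1").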
[jira] [Updated] (HDFS-15372) Files in snapshots no longer see attribute provider permissions
[ https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-15372: --- Fix Version/s: 3.4.0 3.3.1 Resolution: Fixed Status: Resolved (was: Patch Available) Resolving the jira. Thanks [~sodonnell] for finding the issue and fixing it, and [~hemanthboyina] for reviewing the patch. > Files in snapshots no longer see attribute provider permissions > --- > > Key: HDFS-15372 > URL: https://issues.apache.org/jira/browse/HDFS-15372 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Stephen O'Donnell > Assignee: Stephen O'Donnell > Priority: Major > Fix For: 3.3.1, 3.4.0 > > Attachments: HDFS-15372.001.patch, HDFS-15372.002.patch, > HDFS-15372.003.patch, HDFS-15372.004.patch, HDFS-15372.005.patch
[jira] [Commented] (HDFS-15372) Files in snapshots no longer see attribute provider permissions
[ https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17137975#comment-17137975 ] Wei-Chiu Chuang commented on HDFS-15372: Committed the change to trunk and branch-3.3. There are pretty big conflicts cherry-picking to branch-3.2 due to HDFS-14743. > Files in snapshots no longer see attribute provider permissions > --- > > Key: HDFS-15372 > URL: https://issues.apache.org/jira/browse/HDFS-15372 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Stephen O'Donnell > Assignee: Stephen O'Donnell > Priority: Major > Attachments: HDFS-15372.001.patch, HDFS-15372.002.patch, > HDFS-15372.003.patch, HDFS-15372.004.patch, HDFS-15372.005.patch
[jira] [Commented] (HDFS-15372) Files in snapshots no longer see attribute provider permissions
[ https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17137971#comment-17137971 ] Wei-Chiu Chuang commented on HDFS-15372: +1 from me. Will commit later. > Files in snapshots no longer see attribute provider permissions > --- > > Key: HDFS-15372 > URL: https://issues.apache.org/jira/browse/HDFS-15372 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Stephen O'Donnell > Assignee: Stephen O'Donnell > Priority: Major > Attachments: HDFS-15372.001.patch, HDFS-15372.002.patch, > HDFS-15372.003.patch, HDFS-15372.004.patch, HDFS-15372.005.patch
[jira] [Updated] (HDFS-15289) Allow viewfs mounts with HDFS/HCFS scheme and centralized mount table
[ https://issues.apache.org/jira/browse/HDFS-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15289: --- Description: ViewFS provides flexibility to mount different filesystem types with a mount-points configuration table. This approach solves the scalability problems, but users need to reconfigure the filesystem to ViewFS and to its scheme. This is problematic in the case of paths persisted in meta stores, e.g. Hive: systems like Hive store URIs in the meta store, so changing the filesystem scheme creates a burden to upgrade/recreate meta stores. In our experience many users are not ready to change that. Router based federation is another implementation to provide coordinated mount points for HDFS federation clusters. Even though this provides flexibility to handle mount points easily, it will not allow other (non-HDFS) filesystems to mount, so it does not serve the purpose when users want to mount external (non-HDFS) filesystems. So, the problem here is: even though many users want to adopt the scalable fs options available, the technical challenge of changing schemes (e.g. in meta stores) in deployments is obstructing them. So, we propose to allow the hdfs scheme in a ViewFS-like client-side mount system, and provision users to create mount links without changing URI paths. I will upload a detailed design doc shortly. was: ViewFS provides flexibility to mount different filesystem types with a mount-points configuration table. Additionally ViewFS provides flexibility to configure any fs (not only HDFS) scheme in the mount table mapping. This approach solves the scalability problems, but users need to reconfigure the filesystem to ViewFS and to its scheme. This is problematic in the case of paths persisted in meta stores, e.g. Hive: systems like Hive store URIs in the meta store, so changing the filesystem scheme creates a burden to upgrade/recreate meta stores. In our experience many users are not ready to change that. Router based federation is another implementation to provide coordinated mount points for HDFS federation clusters. Even though this provides flexibility to handle mount points easily, it will not allow other (non-HDFS) filesystems to mount, so it does not serve the purpose when users want to mount external (non-HDFS) filesystems. So, the problem here is: even though many users want to adopt the scalable fs options available, the technical challenge of changing schemes (e.g. in meta stores) in deployments is obstructing them. So, we propose to allow the hdfs scheme in a ViewFS-like client-side mount system, and provision users to create mount links without changing URI paths. I will upload a detailed design doc shortly. > Allow viewfs mounts with HDFS/HCFS scheme and centralized mount table > - > > Key: HDFS-15289 > URL: https://issues.apache.org/jira/browse/HDFS-15289 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs > Affects Versions: 3.2.0 > Reporter: Uma Maheswara Rao G > Assignee: Uma Maheswara Rao G > Priority: Major > Attachments: ViewFSOverloadScheme - V1.0.pdf, ViewFSOverloadScheme.png
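To make the proposal concrete, a hypothetical client configuration might look like the sketch below. The fs.<scheme>.impl override and the fs.viewfs.mounttable.<cluster>.link./path pattern follow the viewfs conventions, but the cluster name, paths, and target URIs here are illustrative, not copied from the committed feature:

```xml
<!-- Illustrative sketch: overload the hdfs scheme with the ViewFS-style
     mount implementation, so existing hdfs:// URIs keep working. -->
<property>
  <name>fs.hdfs.impl</name>
  <value>org.apache.hadoop.fs.viewfs.ViewFileSystemOverloadScheme</value>
</property>
<!-- Mount links keyed by the authority users already have in their URIs;
     "ns1" and the targets below are made-up examples. -->
<property>
  <name>fs.viewfs.mounttable.ns1.link./data</name>
  <value>hdfs://ns1/data</value>
</property>
<property>
  <name>fs.viewfs.mounttable.ns1.link./backup</name>
  <value>o3fs://bucket.volume/backup</value>
</property>
```

With something like this, a path such as hdfs://ns1/data resolves through the mount table without the Hive meta store ever seeing a viewfs:// scheme.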
[jira] [Commented] (HDFS-8538) Change the default volume choosing policy to AvailableSpaceVolumeChoosingPolicy
[ https://issues.apache.org/jira/browse/HDFS-8538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136810#comment-17136810 ] Stephen O'Donnell commented on HDFS-8538: - As this is an old Jira and there were some objections to this change historically, I asked on the mailing list to see if the community was happy to make this change. To add some context, in the 5 years since this change was suggested, at Cloudera we have seen about 1000 clusters running with Available Space enabled, and we have not seen any issues caused by it. It feels like this policy should be the default, as we have to change it more often than not. From the mailing list, I got 3 +1's to move forward with making AvailableSpaceVolumeChoosingPolicy the default from 3.4.0 and no objections. Archives are at: http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/202004.mbox/browser and http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/202005.mbox/browser With that in mind I will move forward on this in a week or so, unless anyone objects here. > Change the default volume choosing policy to > AvailableSpaceVolumeChoosingPolicy > --- > > Key: HDFS-8538 > URL: https://issues.apache.org/jira/browse/HDFS-8538 > Project: Hadoop HDFS > Issue Type: Improvement > Affects Versions: 2.7.0 > Reporter: Andrew Wang > Assignee: Andrew Wang > Priority: Major > Attachments: hdfs-8538.001.patch > > > For datanodes with different sized disks, they almost always want the > available space policy. Users with homogenous disks are unaffected. > Since this code has baked for a while, let's change it to be the default.
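Until the default changes, opting in is a one-property change in hdfs-site.xml. The property and class names below are the long-standing HDFS ones, reproduced from memory, so verify them against your release's hdfs-default.xml:

```xml
<!-- Sketch: opt in to the available-space policy on datanodes today,
     rather than waiting for it to become the default. -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
```

This is exactly the change operators with mixed disk sizes have been making by hand, which is the argument above for flipping the default.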
[jira] [Commented] (HDFS-15406) Improve the speed of Datanode Block Scan
[ https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136797#comment-17136797 ] Stephen O'Donnell commented on HDFS-15406: -- Going from 3m20 to 52 seconds with such a simple change is a big win. It would be great to see how long the sort call takes on this line: {code} Collections.sort(bl); // Sort based on blockId {code} My feeling is that we can just drop this call due to the already sorted FoldedTreeSet, but it might be best to create a new method on FSDatasetSPI called getSortedFinalizedBlocks to make it obvious, as there are possibly other implementations of FSDatasetImpl which may not have the blocks already sorted. If you don't want to investigate that as part of this Jira, we can create a sub-jira for it. It might be a trivial change to improve things a bit further, perhaps saving about another 5 seconds. > Improve the speed of Datanode Block Scan > > > Key: HDFS-15406 > URL: https://issues.apache.org/jira/browse/HDFS-15406 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-15406.001.patch > > > In our customer cluster we have approx 10M blocks in one datanode > when the Datanode scans all the blocks, it has taken nearly 5 mins > {code:java} > 2020-06-10 12:17:06,869 | INFO | > java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty > queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: > 11149530, missing metadata files:472, missing block files:472, missing blocks > in memory:0, mismatched blocks:0 | DirectoryScanner.java:473 > 2020-06-10 12:17:06,869 | WARN | > java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty > queue] | Lock held time above threshold: lock identifier: > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl > lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. 
The stack trace is: > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148) > org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186) > org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133) > org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84) > org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320) > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > java.lang.Thread.run(Thread.java:748) > | InstrumentedLock.java:143 {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
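The getSortedFinalizedBlocks idea above can be sketched as follows. This is an illustrative toy only, not the real FsDatasetSpi: a TreeSet of block IDs stands in for the FoldedTreeSet of ReplicaInfo, and the method name is taken from the comment while its signature is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch: a dataset whose replica map is already a sorted
// structure can expose a "sorted finalized blocks" view without an extra
// O(n log n) sort pass in DirectoryScanner.
class SortedDatasetSketch {

    // Stand-in for the sorted replica map (FoldedTreeSet in the real code).
    private final TreeSet<Long> finalizedBlockIds =
        new TreeSet<>(Arrays.asList(9L, 1L, 5L, 3L));

    // Hypothetical analogue of the proposed getSortedFinalizedBlocks():
    // copying out of a sorted set preserves ascending order, so callers
    // can skip Collections.sort() entirely.
    List<Long> getSortedFinalizedBlocks() {
        return new ArrayList<>(finalizedBlockIds);
    }

    public static void main(String[] args) {
        List<Long> bl = new SortedDatasetSketch().getSortedFinalizedBlocks();
        List<Long> resorted = new ArrayList<>(bl);
        Collections.sort(resorted); // the sort DirectoryScanner would have done
        if (!bl.equals(resorted)) {
            throw new AssertionError("copy lost ordering");
        }
        System.out.println(bl); // prints [1, 3, 5, 9]
    }
}
```

Making the contract explicit in the method name, as the comment suggests, protects against other FsDataset implementations whose internal maps are not sorted.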
[jira] [Commented] (HDFS-15406) Improve the speed of Datanode Block Scan
[ https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136730#comment-17136730 ] hemanthboyina commented on HDFS-15406: -- Thanks [~sodonnell] for the comment. {quote} # After you started caching `getBaseURI()` did it improve the runtime of both the getDiskReport() step and compare with in-memory step?{quote} Yes, on caching getBaseURI the time taken improved in both places; more details below. {quote}2. Looking at the code on trunk, I don't think we create any scanInfo objects under the lock in the compare sections unless there is a difference. If this change improved your runtime under the lock from 6m -> 52 seconds, is this because there is a large number of differences between disk and memory on your cluster for some reason? {quote} I think you are referring to the creation of ScanInfo objects here, because creating a ScanInfo object uses vol.getBaseUri(). Even when we do not create ScanInfo objects, we internally call getBaseURI three times inside the lock, via info.getBlockFile(), info.getGenStamp() and memBlock.compareWith(info). So if there is a large number of differences between disk and in-memory, the calls to getBaseURI are even more frequent. By caching getBaseURI we saved at least three calls inside the lock and one outside it, so there was a huge decrease in the lock hold time. We tried creating 11M blocks in our independent cluster: the lock was held for 3 min 20 sec before caching, and 52 sec after caching getBaseURI. {quote}3. Did you capture any profiles (flame chart or debug log messages) to see how long each part of the code under the lock runs for? I am interested in these lines in DirectoryScanner#scan(): {quote} I didn't capture profiles, but I could see dataset.getFinalizedBlocks(bpid) in the stack trace, as it took more than 600 ms on every iteration.
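The caching change described above can be sketched like this. The class below is a toy stand-in, not the actual FsVolumeImpl patch; the names are illustrative only.

```java
import java.net.URI;

// Toy illustration of the fix discussed above: compute the volume's base
// URI once and reuse it, instead of rebuilding it on every call. Per the
// comment, getBaseURI is hit at least three times per differing block
// inside the lock (getBlockFile, getGenStamp, compareWith), so the
// savings multiply across millions of blocks.
class CachedVolumeSketch {

    private final URI baseURI; // computed once, reused for every block

    CachedVolumeSketch(String base) {
        this.baseURI = URI.create(base); // the one-time "expensive" step
    }

    URI getBaseURI() {
        return baseURI; // a plain field read while the lock is held
    }

    public static void main(String[] args) {
        CachedVolumeSketch vol = new CachedVolumeSketch("file:/data/disk1/");
        // Repeated calls return the same cached instance; no re-parsing.
        if (vol.getBaseURI() != vol.getBaseURI()) {
            throw new AssertionError("expected the cached URI instance");
        }
        System.out.println(vol.getBaseURI()); // prints file:/data/disk1/
    }
}
```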
[jira] [Commented] (HDFS-15406) Improve the speed of Datanode Block Scan
[ https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136570#comment-17136570 ] Stephen O'Donnell commented on HDFS-15406: -- Thinking about this some more, as `dataset.getFinalizedBlocks(bpid);` makes a new copy of all the finalized blocks in the block pool, do we even need to hold the DN lock while we compare the differences between on disk and in memory? From the scan step, we have captured a snapshot of what is on disk. After calling `dataset.getFinalizedBlocks(bpid);` we have taken a snapshot of in memory. The two snapshots are never 100% in sync, as things are always changing while the disk is scanned. We are only comparing finalized blocks, so they should not really change:
* If a block is deleted after our snapshot, our snapshot will not see it and that is OK.
* A finalized block could be appended. If that happens both the genstamp and length will change, but that should be handled by reconcile when it calls `FSDatasetImpl.checkAndUpdate()`, and there is nothing stopping blocks being appended after they have been scanned from disk, but before they have been compared with memory.

I am not 100% sure about this, but my suspicion is that we can do a lot of this work outside of the lock, and checkAndUpdate() re-checks any differences later, under the lock, on a block-by-block basis. 
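The copy-then-diff pattern described above can be sketched as below. Plain block IDs stand in for ReplicaInfo/ScanInfo, and all names are illustrative, not the real DirectoryScanner code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch of diffing two immutable snapshots outside the lock:
// take the on-disk and in-memory snapshots (briefly) under the lock, then
// compare them lock-free. Any block that changed after the snapshots were
// taken gets re-checked later, per block, by the reconcile/checkAndUpdate
// step, which is what makes the lock-free comparison safe.
class SnapshotDiffSketch {

    static List<Long> missingInMemory(Set<Long> onDisk, Set<Long> inMemory) {
        List<Long> missing = new ArrayList<>();
        for (long blockId : onDisk) {
            if (!inMemory.contains(blockId)) {
                missing.add(blockId); // on disk but not in the replica map
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Both sets are private copies, so no shared state is read here.
        Set<Long> diskSnapshot = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        Set<Long> memorySnapshot = new TreeSet<>(Arrays.asList(1L, 3L, 4L));
        List<Long> diff = missingInMemory(diskSnapshot, memorySnapshot);
        if (!diff.equals(Collections.singletonList(2L))) {
            throw new AssertionError(diff.toString());
        }
        System.out.println("missing in memory: " + diff); // prints [2]
    }
}
```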
[jira] [Comment Edited] (HDFS-15406) Improve the speed of Datanode Block Scan
[ https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136552#comment-17136552 ] Stephen O'Donnell edited comment on HDFS-15406 at 6/16/20, 11:09 AM: - [~hemanthboyina], this is a good find. Can I just clarify: 1. After you started caching `getBaseURI()` did it improve the runtime of both the getDiskReport() step and compare with in-memory step? 2. Looking at the code on trunk, I don't think we create any scanInfo objects under the lock in the compare sections unless there is a difference. If this change improved your runtime under the lock from 6m -> 52 seconds, is this because there is a large number of differences between disk and memory on your cluster for some reason? 3. Did you capture any profiles (flame chart or debug log messages) to see how long each part of the code under the lock runs for? I am interested in these lines in DirectoryScanner#scan(): {code} final List<ReplicaInfo> bl = dataset.getFinalizedBlocks(bpid); Collections.sort(bl); // Sort based on blockId {code} You mentioned this DN has 11M blocks, so I imagine forming this list of ReplicaInfo and then sorting it takes some time, several seconds at least. Based on tests I did here, sorting 11M blocks would probably take about 5 seconds: https://issues.apache.org/jira/browse/HDFS-15140?focusedCommentId=17023077=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17023077 Inside the replicaMap, the ReplicaInfo are stored in a FoldedTreeSet, which is a sorted structure. We should be able to get an iterator on it and avoid the need to create this new ReplicaInfo list and sort it. It would require some changes to the subsequent code if we used an Iterator, but I suspect we can just drop the sort with no further changes. 
If you are able to test this on your real datanode, it would be interesting to see how long it takes for the getFinalizedBlocks and then for the sort to see if this shaves some more time off under the lock. 
[jira] [Commented] (HDFS-15406) Improve the speed of Datanode Block Scan
[ https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136552#comment-17136552 ] Stephen O'Donnell commented on HDFS-15406: -- [~hemanthboyina], this is a good find. Can I just clarify: 1. After you started caching `getBaseURI()` did it improve the runtime of both the getDiskReport() step and compare with in-memory step? 2. Looking at the code on trunk, I don't think we create any scanInfo objects under the lock in the compare sections unless there is a difference. If this change improved your runtime under the lock from 6m -> 52 seconds, is this because there is a large number of differences between disk and memory on your cluster for some reason? 3. Did you capture any profiles (flame chart or debug log messages) to see how long each part of the code under the lock runs for? I am interested in these lines: {code} final List<ReplicaInfo> bl = dataset.getFinalizedBlocks(bpid); Collections.sort(bl); // Sort based on blockId {code} You mentioned this DN has 11M blocks, so I imagine forming this list of ReplicaInfo and then sorting it takes some time, several seconds at least. Based on tests I did here, sorting 11M blocks would probably take about 5 seconds: https://issues.apache.org/jira/browse/HDFS-15140?focusedCommentId=17023077=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17023077 Inside the replicaMap, the ReplicaInfo are stored in a FoldedTreeSet, which is a sorted structure. We should be able to get an iterator on it and avoid the need to create this new ReplicaInfo list and sort it. It would require some changes to the subsequent code if we used an Iterator, but I suspect we can just drop the sort with no further changes. If you are able to test this on your real datanode, it would be interesting to see how long it takes for the getFinalizedBlocks and then for the sort to see if this shaves some more time off under the lock. 
[jira] [Commented] (HDFS-15346) DistCpFedBalance implementation
[ https://issues.apache.org/jira/browse/HDFS-15346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136480#comment-17136480 ] Yiqun Lin commented on HDFS-15346: -- LGTM, +1. Will commit this the day after tomorrow if there are no other comments. > DistCpFedBalance implementation > --- > > Key: HDFS-15346 > URL: https://issues.apache.org/jira/browse/HDFS-15346 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-15346.001.patch, HDFS-15346.002.patch, > HDFS-15346.003.patch, HDFS-15346.004.patch, HDFS-15346.005.patch, > HDFS-15346.006.patch, HDFS-15346.007.patch, HDFS-15346.008.patch, > HDFS-15346.009.patch, HDFS-15346.010.patch, HDFS-15346.011.patch, > HDFS-15346.012.patch > > > Patch in HDFS-15294 is too big to review, so we split it into two patches. This > is the second one. Details can be found at HDFS-15294.
[jira] [Commented] (HDFS-15346) DistCpFedBalance implementation
[ https://issues.apache.org/jira/browse/HDFS-15346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136433#comment-17136433 ] Hadoop QA commented on HDFS-15346: -- | (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 22s | Docker mode activated. |
|| || Prechecks || ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 15 new or modified test files. |
|| || trunk Compile Tests || ||
| 0 | mvndep | 1m 4s | Maven dependency ordering for branch |
| +1 | mvninstall | 21m 14s | trunk passed |
| +1 | compile | 18m 3s | trunk passed |
| +1 | checkstyle | 2m 53s | trunk passed |
| +1 | mvnsite | 5m 53s | trunk passed |
| +1 | shadedclient | 16m 21s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 4m 34s | trunk passed |
| 0 | spotbugs | 5m 16s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| 0 | findbugs | 0m 31s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| 0 | findbugs | 0m 31s | branch/hadoop-assemblies no findbugs output file (findbugsXml.xml) |
| 0 | findbugs | 0m 32s | branch/hadoop-tools/hadoop-tools-dist no findbugs output file (findbugsXml.xml) |
|| || Patch Compile Tests || ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 49s | the patch passed |
| +1 | compile | 17m 32s | the patch passed |
| +1 | javac | 17m 32s | the patch passed |
| +1 | checkstyle | 2m 55s | the patch passed |
| +1 | mvnsite | 6m 37s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | shelldocs | 0m 34s | There were no new shelldocs issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 6s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 15m 39s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 5m 14s | the patch passed |
| 0 | findbugs | 0m 32s | hadoop-project has no data from findbugs |
| 0 | findbugs | 0m 32s | hadoop-assemblies has no data from findbugs |
| 0 | findbugs | 0m 33s | hadoop-tools/hadoop-tools-dist has no