[jira] [Assigned] (HDFS-17031) Reduce some repeated code in RouterRpcServer
[ https://issues.apache.org/jira/browse/HDFS-17031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang reassigned HDFS-17031:
------------------------------------

    Assignee: Chengwei Wang

> Reduce some repeated code in RouterRpcServer
> --------------------------------------------
>
>          Key: HDFS-17031
>          URL: https://issues.apache.org/jira/browse/HDFS-17031
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>   Components: rbf
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Minor
>       Labels: pull-request-available
>
> Reduce repeated code:
>
> {code:java}
> if (subclusterResolver instanceof MountTableResolver) {
>   try {
>     MountTableResolver mountTable = (MountTableResolver) subclusterResolver;
>     MountTable entry = mountTable.getMountPoint(path);
>     // check logic
>   } catch (IOException e) {
>     LOG.error("Cannot get mount point", e);
>   }
> }
> return false;
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-17031) Reduce some repeated code in RouterRpcServer
Chengwei Wang created HDFS-17031:
---------------------------------

     Summary: Reduce some repeated code in RouterRpcServer
         Key: HDFS-17031
         URL: https://issues.apache.org/jira/browse/HDFS-17031
     Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
    Reporter: Chengwei Wang

Reduce repeated code:

{code:java}
if (subclusterResolver instanceof MountTableResolver) {
  try {
    MountTableResolver mountTable = (MountTableResolver) subclusterResolver;
    MountTable entry = mountTable.getMountPoint(path);
    // check logic
  } catch (IOException e) {
    LOG.error("Cannot get mount point", e);
  }
}
return false;
{code}
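The repeated pattern above could be factored into one helper that owns the `instanceof` check and the `IOException` handling. A minimal self-contained sketch of the idea — the stub classes stand in for the real Hadoop RBF types, and the helper name `getMountTableEntry` is hypothetical, not necessarily what the patch uses:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Stub types standing in for the real Hadoop RBF classes.
interface SubclusterResolver {}

class MountTable {
    final String path;
    MountTable(String path) { this.path = path; }
}

class MountTableResolver implements SubclusterResolver {
    private final Map<String, MountTable> entries = new HashMap<>();
    void add(String path) { entries.put(path, new MountTable(path)); }
    MountTable getMountPoint(String path) throws IOException {
        return entries.get(path);
    }
}

public class MountPointHelper {
    // Hypothetical helper: centralizes the instanceof check and the
    // IOException handling that the issue reports as duplicated.
    static MountTable getMountTableEntry(SubclusterResolver resolver, String path) {
        if (resolver instanceof MountTableResolver) {
            try {
                return ((MountTableResolver) resolver).getMountPoint(path);
            } catch (IOException e) {
                System.err.println("Cannot get mount point: " + e);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MountTableResolver resolver = new MountTableResolver();
        resolver.add("/data");
        // Callers now only null-check the returned entry instead of
        // repeating the whole try/catch block at every call site.
        System.out.println(getMountTableEntry(resolver, "/data") != null);
        System.out.println(getMountTableEntry(resolver, "/missing") == null);
    }
}
```

Each former call site then shrinks to a null check on the returned entry.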
[jira] [Commented] (HDFS-16985) Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
[ https://issues.apache.org/jira/browse/HDFS-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713367#comment-17713367 ]

Chengwei Wang commented on HDFS-16985:
--------------------------------------

Created PR [https://github.com/apache/hadoop/pull/5564], please take a look [~weichiu].

> Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
> --------------------------------------------------------------------------------------------
>
>          Key: HDFS-16985
>          URL: https://issues.apache.org/jira/browse/HDFS-16985
>      Project: Hadoop HDFS
>   Issue Type: Bug
>   Components: datanode
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>       Labels: pull-request-available
>
> We encountered several missing-block problems in our production cluster, where HDFS runs on AWS EC2 + EBS.
> The root cause:
> # the block has only 1 replica left and has not been reconstructed yet
> # the DN checks that the block file exists during BlockSender construction
> # the EBS check fails and throws FileNotFoundException (EBS may be in a fault condition)
> # the DN invalidates the block and schedules an async block deletion
> # EBS is already back to normal when the DN deletes the block
> # the block file is deleted permanently and cannot be recovered
[jira] [Commented] (HDFS-16985) Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
[ https://issues.apache.org/jira/browse/HDFS-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713362#comment-17713362 ]

Chengwei Wang commented on HDFS-16985:
--------------------------------------

[~weichiu] thanks for the comment. I have tried to explain the root cause in the description. The code in BlockSender:

{code:java}
try {
  // check block file
  // check block meta file
} catch (FileNotFoundException e) {
  if ((e.getMessage() != null) && !(e.getMessage()
      .contains("Too many open files"))) {
    // The replica is on its volume map but not on disk
    datanode.notifyNamenodeDeletedBlock(block, replica.getStorageUuid());
    // the local block file will be deleted asynchronously
    datanode.data.invalidate(block.getBlockPoolId(),
        new Block[] {block.getLocalBlock()});
  }
  throw e;
} finally {
  if (!keepMetaInOpen) {
    IOUtils.closeStream(metaIn);
  }
}
{code}

> Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
> --------------------------------------------------------------------------------------------
>
>          Key: HDFS-16985
>          URL: https://issues.apache.org/jira/browse/HDFS-16985
>      Project: Hadoop HDFS
>   Issue Type: Bug
>   Components: datanode
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> We encountered several missing-block problems in our production cluster, where HDFS runs on AWS EC2 + EBS.
> The root cause:
> # the block has only 1 replica left and has not been reconstructed yet
> # the DN checks that the block file exists during BlockSender construction
> # the EBS check fails and throws FileNotFoundException (EBS may be in a fault condition)
> # the DN invalidates the block and schedules an async block deletion
> # EBS is already back to normal when the DN deletes the block
> # the block file is deleted permanently and cannot be recovered
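The hazard in the catch block above is that one failed existence check is trusted as proof the replica is gone. A minimal, self-contained sketch of the failure sequence from the description — all classes here are illustrative stand-ins, not Hadoop code:

```java
import java.io.FileNotFoundException;

// Simulates the failure sequence from the issue description: the
// existence check fails while the volume is faulted, even though the
// block file itself is intact.
class FlakyVolume {
    private boolean faulted = true;          // EBS in fault condition
    private boolean fileDeleted = false;

    void checkBlockFile() throws FileNotFoundException {
        if (faulted) throw new FileNotFoundException("blk_123");
    }
    void recover() { faulted = false; }      // EBS back to normal
    boolean exists() { return !fileDeleted; }
    void deleteBlockFile() { fileDeleted = true; }
}

public class MissingBlockScenario {
    public static void main(String[] args) {
        FlakyVolume vol = new FlakyVolume();
        boolean scheduledDeletion = false;
        try {
            vol.checkBlockFile();            // BlockSender construction
        } catch (FileNotFoundException e) {
            scheduledDeletion = true;        // DN schedules async deletion
        }
        vol.recover();                       // volume healthy again...
        if (scheduledDeletion) {
            vol.deleteBlockFile();           // ...but the async delete still runs
        }
        // The only replica of an intact block is now gone for good.
        System.out.println("last replica deleted: " + !vol.exists());
    }
}
```

This suggests why a fix likely needs to re-verify the file is really absent (or defer the invalidation) before the async delete runs, rather than acting on the first FileNotFoundException.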
[jira] [Updated] (HDFS-16985) Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
[ https://issues.apache.org/jira/browse/HDFS-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16985:
---------------------------------

    Description:
We encountered several missing-block problems in our production cluster, where HDFS runs on AWS EC2 + EBS.
The root cause:
# the block has only 1 replica left and has not been reconstructed yet
# the DN checks that the block file exists during BlockSender construction
# the EBS check fails and throws FileNotFoundException (EBS may be in a fault condition)
# the DN invalidates the block and schedules an async block deletion
# EBS is already back to normal when the DN deletes the block
# the block file is deleted permanently and cannot be recovered

  was:
We encounterd several missing-block problem in our production cluster which hdfs running on AWS EC2 + EBS.
The root cause:
# the block remains only 1 replication and hasn't do
# check block file existing when BlockSender construction
# the EBS checking failed and throw FileNotFoundException
# DN invalidateBlock and schedule delete block async

> Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
> --------------------------------------------------------------------------------------------
>
>          Key: HDFS-16985
>          URL: https://issues.apache.org/jira/browse/HDFS-16985
>      Project: Hadoop HDFS
>   Issue Type: Bug
>   Components: datanode
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> We encountered several missing-block problems in our production cluster, where HDFS runs on AWS EC2 + EBS.
> The root cause:
> # the block has only 1 replica left and has not been reconstructed yet
> # the DN checks that the block file exists during BlockSender construction
> # the EBS check fails and throws FileNotFoundException (EBS may be in a fault condition)
> # the DN invalidates the block and schedules an async block deletion
> # EBS is already back to normal when the DN deletes the block
> # the block file is deleted permanently and cannot be recovered
[jira] [Updated] (HDFS-16985) Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
[ https://issues.apache.org/jira/browse/HDFS-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16985:
---------------------------------

    Description:
We encounterd several missing-block problem in our production cluster which hdfs running on AWS EC2 + EBS.
The root cause:
# the block remains only 1 replication and hasn't do
# check block file existing when BlockSender construction
# the EBS checking failed and throw FileNotFoundException
# DN invalidateBlock and schedule delete block async

  was:
We encounterd several missing-block problem in our production cluster which hdfs running on AWS EC2 + EBS.
The root cause:
# check block file existing when BlockSender construction
# the EBS checking failed and throw
#

> Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
> --------------------------------------------------------------------------------------------
>
>          Key: HDFS-16985
>          URL: https://issues.apache.org/jira/browse/HDFS-16985
>      Project: Hadoop HDFS
>   Issue Type: Bug
>   Components: datanode
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> We encounterd several missing-block problem in our production cluster which hdfs running on AWS EC2 + EBS.
> The root cause:
> # the block remains only 1 replication and hasn't do
> # check block file existing when BlockSender construction
> # the EBS checking failed and throw FileNotFoundException
> # DN invalidateBlock and schedule delete block async
[jira] [Updated] (HDFS-16985) Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
[ https://issues.apache.org/jira/browse/HDFS-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16985:
---------------------------------

    Description:
We encounterd several missing-block problem in our production cluster which hdfs running on AWS EC2 + EBS.
The root cause:
# check block file existing when BlockSender construction
# the EBS checking failed and throw
#

> Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
> --------------------------------------------------------------------------------------------
>
>          Key: HDFS-16985
>          URL: https://issues.apache.org/jira/browse/HDFS-16985
>      Project: Hadoop HDFS
>   Issue Type: Bug
>   Components: datanode
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> We encounterd several missing-block problem in our production cluster which hdfs running on AWS EC2 + EBS.
> The root cause:
> # check block file existing when BlockSender construction
> # the EBS checking failed and throw
> #
[jira] [Created] (HDFS-16985) Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
Chengwei Wang created HDFS-16985:
---------------------------------

     Summary: Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
         Key: HDFS-16985
         URL: https://issues.apache.org/jira/browse/HDFS-16985
     Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
    Reporter: Chengwei Wang
[jira] [Assigned] (HDFS-16985) Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
[ https://issues.apache.org/jira/browse/HDFS-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang reassigned HDFS-16985:
------------------------------------

    Assignee: Chengwei Wang

> Deleting the local block file when FileNotFoundException occurs may lead to a missing block.
> --------------------------------------------------------------------------------------------
>
>          Key: HDFS-16985
>          URL: https://issues.apache.org/jira/browse/HDFS-16985
>      Project: Hadoop HDFS
>   Issue Type: Bug
>   Components: datanode
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
[jira] [Created] (HDFS-16762) Make the default value of dfs.federation.router.client.allow-partial-listing false.
Chengwei Wang created HDFS-16762:
---------------------------------

     Summary: Make the default value of dfs.federation.router.client.allow-partial-listing false.
         Key: HDFS-16762
         URL: https://issues.apache.org/jira/browse/HDFS-16762
     Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
    Reporter: Chengwei Wang
    Assignee: Chengwei Wang

As the default value of _*dfs.federation.router.client.allow-partial-listing*_ is _*true*_, the HDFS client will get a _*partial result*_ when one or more of the subclusters are unavailable due to missing permissions or other exceptions, but _*the user may not know*_. This can lead to faults, so I think it's better to make the default value false.
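Until (or unless) the default changes, operators who want fail-fast listings can override the property per cluster. A standard hdfs-site.xml override — the property name is the one from the issue, the description text is illustrative:

```xml
<property>
  <name>dfs.federation.router.client.allow-partial-listing</name>
  <value>false</value>
  <description>Fail listing operations instead of silently returning a
    partial result when one or more subclusters are unavailable.</description>
</property>
```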
[jira] [Commented] (HDFS-14082) RBF: Add option to fail operations when a subcluster is unavailable
[ https://issues.apache.org/jira/browse/HDFS-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598236#comment-17598236 ]

Chengwei Wang commented on HDFS-14082:
--------------------------------------

Thanks [~elgoiri] for the reply. Making the default false is a simple way to fix the issue. If needed, I am willing to create a PR for it. But I think the key question is whether we need to inform the user that the result is incomplete when doing fault tolerance, and how. What's your suggestion?

> RBF: Add option to fail operations when a subcluster is unavailable
> -------------------------------------------------------------------
>
>          Key: HDFS-14082
>          URL: https://issues.apache.org/jira/browse/HDFS-14082
>      Project: Hadoop HDFS
>   Issue Type: Sub-task
>     Reporter: Íñigo Goiri
>     Assignee: Íñigo Goiri
>     Priority: Major
>      Fix For: HDFS-13891
>
>  Attachments: HDFS-14082-HDFS-13891.002.patch, HDFS-14082-HDFS-13891.003.patch, HDFS-14082.000.patch, HDFS-14082.001.patch
>
> When a subcluster is unavailable, we succeed operations like {{getListing()}}. We should add an option to fail the operation if one of the subclusters is unavailable.
[jira] [Commented] (HDFS-14082) RBF: Add option to fail operations when a subcluster is unavailable
[ https://issues.apache.org/jira/browse/HDFS-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17597712#comment-17597712 ]

Chengwei Wang commented on HDFS-14082:
--------------------------------------

Hi [~elgoiri], as the default value of _*dfs.federation.router.client.allow-partial-listing*_ is _*true*_, the HDFS client will get a _*partial result*_ when one or more of the subclusters are unavailable, but _*the user would not be aware of it*_. I think it's better to set the default value to false, or to throw a _*PartialResultException*_ to the HDFS client and let the client decide whether a partial result is allowed. What do you think?

> RBF: Add option to fail operations when a subcluster is unavailable
> -------------------------------------------------------------------
>
>          Key: HDFS-14082
>          URL: https://issues.apache.org/jira/browse/HDFS-14082
>      Project: Hadoop HDFS
>   Issue Type: Sub-task
>     Reporter: Íñigo Goiri
>     Assignee: Íñigo Goiri
>     Priority: Major
>      Fix For: HDFS-13891
>
>  Attachments: HDFS-14082-HDFS-13891.002.patch, HDFS-14082-HDFS-13891.003.patch, HDFS-14082.000.patch, HDFS-14082.001.patch
>
> When a subcluster is unavailable, we succeed operations like {{getListing()}}. We should add an option to fail the operation if one of the subclusters is unavailable.
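The "throw a PartialResultException and let the client decide" idea in the comment above can be sketched in isolation. Everything here is hypothetical — `PartialResultException` and `listAll` are illustrative names, not Hadoop APIs — but it shows the shape: the exception carries the partial listing, so callers must opt in explicitly instead of mistaking it for a complete result:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical exception: signals an incomplete listing while still
// carrying the entries that were successfully gathered.
class PartialResultException extends Exception {
    private final List<String> partialListing;
    PartialResultException(List<String> partialListing, String reason) {
        super(reason);
        this.partialListing = partialListing;
    }
    List<String> getPartialListing() { return partialListing; }
}

public class PartialListingDemo {
    // Simulates a router-side listing where one subcluster is down.
    static List<String> listAll() throws PartialResultException {
        List<String> fromHealthySubcluster = Arrays.asList("/a", "/b");
        boolean subclusterDown = true; // simulate an unavailable subcluster
        if (subclusterDown) {
            throw new PartialResultException(fromHealthySubcluster,
                "subcluster ns1 unavailable");
        }
        return fromHealthySubcluster;
    }

    public static void main(String[] args) {
        List<String> names;
        try {
            names = listAll();
        } catch (PartialResultException e) {
            // The client explicitly opts in to using the partial result.
            names = e.getPartialListing();
        }
        System.out.println(names);
    }
}
```

The design point: flipping the default to false makes the failure loud, while an exception that carries the partial data would let callers choose per operation.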
[jira] [Commented] (HDFS-16554) Remove unused configuration dfs.namenode.block.deletion.increment.
[ https://issues.apache.org/jira/browse/HDFS-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527131#comment-17527131 ]

Chengwei Wang commented on HDFS-16554:
--------------------------------------

Hi [~hexiaoqiao], could you please take a look?

> Remove unused configuration dfs.namenode.block.deletion.increment.
> ------------------------------------------------------------------
>
>          Key: HDFS-16554
>          URL: https://issues.apache.org/jira/browse/HDFS-16554
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>       Labels: pull-request-available
>   Time Spent: 0.5h
>   Remaining Estimate: 0h
>
> The configuration *_dfs.namenode.block.deletion.increment_* will not be used after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.
[jira] [Commented] (HDFS-16553) Fix checkstyle for the length of BlockManager construction method over limit.
[ https://issues.apache.org/jira/browse/HDFS-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526971#comment-17526971 ]

Chengwei Wang commented on HDFS-16553:
--------------------------------------

Hi [~hexiaoqiao], could you please take a look?

> Fix checkstyle for the length of BlockManager construction method over limit.
> ------------------------------------------------------------------------------
>
>          Key: HDFS-16553
>          URL: https://issues.apache.org/jira/browse/HDFS-16553
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>       Labels: pull-request-available
>   Time Spent: 0.5h
>   Remaining Estimate: 0h
>
> The BlockManager constructor is 156 lines long, which is over the 150-line checkstyle limit, so refactor the method to fix the checkstyle violation.
[jira] [Updated] (HDFS-16554) Remove unused configuration dfs.namenode.block.deletion.increment.
[ https://issues.apache.org/jira/browse/HDFS-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16554:
---------------------------------

    Description:
The configuration *_dfs.namenode.block.deletion.increment_* will not be used after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

  was:
The configuration *_dfs.namenode.block.deletion.increment_* no used after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

> Remove unused configuration dfs.namenode.block.deletion.increment.
> ------------------------------------------------------------------
>
>          Key: HDFS-16554
>          URL: https://issues.apache.org/jira/browse/HDFS-16554
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> The configuration *_dfs.namenode.block.deletion.increment_* will not be used after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.
[jira] [Updated] (HDFS-16554) Remove unused configuration dfs.namenode.block.deletion.increment.
[ https://issues.apache.org/jira/browse/HDFS-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16554:
---------------------------------

    Description:
The configuration *_dfs.namenode.block.deletion.increment_* no used after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

  was:
The configuration *_dfs.namenode.block.deletion.increment_* no longer in use after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

> Remove unused configuration dfs.namenode.block.deletion.increment.
> ------------------------------------------------------------------
>
>          Key: HDFS-16554
>          URL: https://issues.apache.org/jira/browse/HDFS-16554
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> The configuration *_dfs.namenode.block.deletion.increment_* no used after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.
[jira] [Updated] (HDFS-16554) Remove unused configuration dfs.namenode.block.deletion.increment.
[ https://issues.apache.org/jira/browse/HDFS-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16554:
---------------------------------

    Description:
The configuration *_dfs.namenode.block.deletion.increment_* no longer in use after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

  was:
The configuration *_dfs.namenode.block.deletion.increment_* became unused after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

> Remove unused configuration dfs.namenode.block.deletion.increment.
> ------------------------------------------------------------------
>
>          Key: HDFS-16554
>          URL: https://issues.apache.org/jira/browse/HDFS-16554
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> The configuration *_dfs.namenode.block.deletion.increment_* no longer in use after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.
[jira] [Updated] (HDFS-16554) Remove unused configuration dfs.namenode.block.deletion.increment.
[ https://issues.apache.org/jira/browse/HDFS-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16554:
---------------------------------

    Description:
The configuration *_dfs.namenode.block.deletion.increment_* became unused after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

  was:
The configuration *_dfs.namenode.block.deletion.increment_* became unused with HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

> Remove unused configuration dfs.namenode.block.deletion.increment.
> ------------------------------------------------------------------
>
>          Key: HDFS-16554
>          URL: https://issues.apache.org/jira/browse/HDFS-16554
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> The configuration *_dfs.namenode.block.deletion.increment_* became unused after the feature HDFS-16043 that does block deletion asynchronously. So it's better to remove it.
[jira] [Updated] (HDFS-16554) Remove unused configuration dfs.namenode.block.deletion.increment.
[ https://issues.apache.org/jira/browse/HDFS-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16554:
---------------------------------

    Description:
The configuration *_dfs.namenode.block.deletion.increment_* became unused with HDFS-16043 that does block deletion asynchronously. So it's better to remove it.

  was:
the feature HDFS-16043 to our internal branch, it works well in our testing cluster.

> Remove unused configuration dfs.namenode.block.deletion.increment.
> ------------------------------------------------------------------
>
>          Key: HDFS-16554
>          URL: https://issues.apache.org/jira/browse/HDFS-16554
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> The configuration *_dfs.namenode.block.deletion.increment_* became unused with HDFS-16043 that does block deletion asynchronously. So it's better to remove it.
[jira] [Updated] (HDFS-16554) Remove unused configuration dfs.namenode.block.deletion.increment.
[ https://issues.apache.org/jira/browse/HDFS-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16554:
---------------------------------

    Description:
the feature HDFS-16043 to our internal branch, it works well in our testing cluster.

> Remove unused configuration dfs.namenode.block.deletion.increment.
> ------------------------------------------------------------------
>
>          Key: HDFS-16554
>          URL: https://issues.apache.org/jira/browse/HDFS-16554
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> the feature HDFS-16043 to our internal branch, it works well in our testing cluster.
[jira] [Created] (HDFS-16554) Remove unused configuration dfs.namenode.block.deletion.increment.
Chengwei Wang created HDFS-16554:
---------------------------------

     Summary: Remove unused configuration dfs.namenode.block.deletion.increment.
         Key: HDFS-16554
         URL: https://issues.apache.org/jira/browse/HDFS-16554
     Project: Hadoop HDFS
  Issue Type: Improvement
    Reporter: Chengwei Wang
    Assignee: Chengwei Wang
[jira] [Updated] (HDFS-16553) Fix checkstyle for the length of BlockManager construction method over limit.
[ https://issues.apache.org/jira/browse/HDFS-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16553:
---------------------------------

    Description:
The BlockManager constructor is 156 lines long, which is over the 150-line checkstyle limit, so refactor the method to fix the checkstyle violation.

  was:
The BlockManager constructor is 156 lines long, which is over the 150-line checkstyle limit, so refactor to fix the checkstyle violation.

> Fix checkstyle for the length of BlockManager construction method over limit.
> ------------------------------------------------------------------------------
>
>          Key: HDFS-16553
>          URL: https://issues.apache.org/jira/browse/HDFS-16553
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> The BlockManager constructor is 156 lines long, which is over the 150-line checkstyle limit, so refactor the method to fix the checkstyle violation.
[jira] [Updated] (HDFS-16553) Fix checkstyle for the length of BlockManager construction method over limit.
[ https://issues.apache.org/jira/browse/HDFS-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengwei Wang updated HDFS-16553:
---------------------------------

    Description:
The BlockManager constructor is 156 lines long, which is over the 150-line checkstyle limit, so refactor to fix the checkstyle violation.

> Fix checkstyle for the length of BlockManager construction method over limit.
> ------------------------------------------------------------------------------
>
>          Key: HDFS-16553
>          URL: https://issues.apache.org/jira/browse/HDFS-16553
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>
> The BlockManager constructor is 156 lines long, which is over the 150-line checkstyle limit, so refactor to fix the checkstyle violation.
[jira] [Created] (HDFS-16553) Fix checkstyle for the length of BlockManager construction method over limit.
Chengwei Wang created HDFS-16553:
---------------------------------

     Summary: Fix checkstyle for the length of BlockManager construction method over limit.
         Key: HDFS-16553
         URL: https://issues.apache.org/jira/browse/HDFS-16553
     Project: Hadoop HDFS
  Issue Type: Improvement
    Reporter: Chengwei Wang
    Assignee: Chengwei Wang
[jira] [Commented] (HDFS-16500) Make asynchronous block deletion lock and unlock duration thresholds configurable
[ https://issues.apache.org/jira/browse/HDFS-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504693#comment-17504693 ]

Chengwei Wang commented on HDFS-16500:
--------------------------------------

[~hexiaoqiao] [~zhuxiangyi] could you please take a look?

> Make asynchronous block deletion lock and unlock duration thresholds configurable
> ---------------------------------------------------------------------------------
>
>          Key: HDFS-16500
>          URL: https://issues.apache.org/jira/browse/HDFS-16500
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>   Components: namenode
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>       Labels: pull-request-available
>   Time Spent: 0.5h
>   Remaining Estimate: 0h
>
> I have backported the nice feature HDFS-16043 to our internal branch; it works well in our testing cluster.
> I think it's better to make the fields *_deleteBlockLockTimeMs_* and *_deleteBlockUnlockIntervalTimeMs_* configurable, so that we can control the lock and unlock duration.
> {code:java}
> private final long deleteBlockLockTimeMs = 500;
> private final long deleteBlockUnlockIntervalTimeMs = 100;
> {code}
> And we should set the default value smaller to avoid blocking other requests for a long time when deleting some large directories.
[jira] [Commented] (HDFS-16500) Make asynchronous block deletion lock and unlock duration thresholds configurable
[ https://issues.apache.org/jira/browse/HDFS-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504202#comment-17504202 ]

Chengwei Wang commented on HDFS-16500:
--------------------------------------

Created PR https://github.com/apache/hadoop/pull/4061

> Make asynchronous block deletion lock and unlock duration thresholds configurable
> ---------------------------------------------------------------------------------
>
>          Key: HDFS-16500
>          URL: https://issues.apache.org/jira/browse/HDFS-16500
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>   Components: namenode
>     Reporter: Chengwei Wang
>     Assignee: Chengwei Wang
>     Priority: Major
>       Labels: pull-request-available
>   Time Spent: 10m
>   Remaining Estimate: 0h
>
> I have backported the nice feature HDFS-16043 to our internal branch; it works well in our testing cluster.
> I think it's better to make the fields *_deleteBlockLockTimeMs_* and *_deleteBlockUnlockIntervalTimeMs_* configurable, so that we can control the lock and unlock duration.
> {code:java}
> private final long deleteBlockLockTimeMs = 500;
> private final long deleteBlockUnlockIntervalTimeMs = 100;
> {code}
> And we should set the default value smaller to avoid blocking other requests for a long time when deleting some large directories.
[jira] [Updated] (HDFS-16500) Make asynchronous block deletion lock and unlock duration thresholds configurable
[ https://issues.apache.org/jira/browse/HDFS-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16500: - Description: I have backport the nice feature HDFS-16043 to our internal branch, it works well in our testing cluster. I think it's better to make the fields *_deleteBlockLockTimeMs_* and *_deleteBlockUnlockIntervalTimeMs_* configurable, so that we can control the lock and unlock duration. {code:java} private final long deleteBlockLockTimeMs = 500; private final long deleteBlockUnlockIntervalTimeMs = 100;{code} And we should set the default value smaller to avoid blocking other requests long time when deleting some large directories. was: I have backport the nice feature HDFS-16043 to our internal branch, it works well in our testing cluster. I think it's better to make the fileds deleteBlockLockTimeMs and deleteBlockUnlockIntervalTimeMs configurable, so that we can control the lock and unlock duration. {code:java} private final long deleteBlockLockTimeMs = 500; private final long deleteBlockUnlockIntervalTimeMs = 100;{code} And we should set the default value smaller to avoid blocking other requests long time when deleting some large directories. > Make asynchronous blocks deletion lock and unlock durtion threshold > configurable > - > > Key: HDFS-16500 > URL: https://issues.apache.org/jira/browse/HDFS-16500 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namanode >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > I have backport the nice feature HDFS-16043 to our internal branch, it works > well in our testing cluster. > I think it's better to make the fields *_deleteBlockLockTimeMs_* and > *_deleteBlockUnlockIntervalTimeMs_* configurable, so that we can control the > lock and unlock duration. 
> {code:java} > private final long deleteBlockLockTimeMs = 500; > private final long deleteBlockUnlockIntervalTimeMs = 100;{code} > And we should set the default value smaller to avoid blocking other requests > for a long time when deleting some large directories. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16500) Make asynchronous blocks deletion lock and unlock duration threshold configurable
[ https://issues.apache.org/jira/browse/HDFS-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16500: - Description: I have backported the nice feature HDFS-16043 to our internal branch, and it works well in our testing cluster. I think it's better to make the fields deleteBlockLockTimeMs and deleteBlockUnlockIntervalTimeMs configurable, so that we can control the lock and unlock duration. {code:java} private final long deleteBlockLockTimeMs = 500; private final long deleteBlockUnlockIntervalTimeMs = 100;{code} And we should set the default value smaller to avoid blocking other requests for a long time when deleting some large directories. was: I have backport the nice feature HDFS-16043 to our internal branch, it works well in our testing cluster. I think it's better to make the fileds deleteBlockLockTimeMs and deleteBlockUnlockIntervalTimeMs configurable, so that we can control the lock and unlock duration. {code:java} private final long deleteBlockLockTimeMs = 500; private final long deleteBlockUnlockIntervalTimeMs = 100;{code} And we should set the default value smaller to avoid blocking other requests long time when delete some large directories. > Make asynchronous blocks deletion lock and unlock duration threshold > configurable > - > > Key: HDFS-16500 > URL: https://issues.apache.org/jira/browse/HDFS-16500 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > I have backported the nice feature HDFS-16043 to our internal branch, and it works > well in our testing cluster. > I think it's better to make the fields deleteBlockLockTimeMs and > deleteBlockUnlockIntervalTimeMs configurable, so that we can control the lock > and unlock duration. 
> {code:java} > private final long deleteBlockLockTimeMs = 500; > private final long deleteBlockUnlockIntervalTimeMs = 100;{code} > And we should set the default value smaller to avoid blocking other requests > for a long time when deleting some large directories. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16500) Make asynchronous blocks deletion lock and unlock duration threshold configurable
[ https://issues.apache.org/jira/browse/HDFS-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16500: - Description: I have backported the nice feature HDFS-16043 to our internal branch, and it works well in our testing cluster. I think it's better to make the fields deleteBlockLockTimeMs and deleteBlockUnlockIntervalTimeMs configurable, so that we can control the lock and unlock duration. {code:java} private final long deleteBlockLockTimeMs = 500; private final long deleteBlockUnlockIntervalTimeMs = 100;{code} And we should set the default value smaller to avoid blocking other requests for a long time when deleting some large directories. was: I have backport the nice feature HDFS-16043 to our internal branch, it works well in our testing cluster. I think it's better to make the fileds deleteBlockLockTimeMs and deleteBlockUnlockIntervalTimeMs configurable, so that we can control the lock and unlock duration. And we should set the default value smaller to avoid blocking other requests long time. > Make asynchronous blocks deletion lock and unlock duration threshold > configurable > - > > Key: HDFS-16500 > URL: https://issues.apache.org/jira/browse/HDFS-16500 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > I have backported the nice feature HDFS-16043 to our internal branch, and it works > well in our testing cluster. > I think it's better to make the fields deleteBlockLockTimeMs and > deleteBlockUnlockIntervalTimeMs configurable, so that we can control the lock > and unlock duration. > {code:java} > private final long deleteBlockLockTimeMs = 500; > private final long deleteBlockUnlockIntervalTimeMs = 100;{code} > And we should set the default value smaller to avoid blocking other requests > for a long time when deleting some large directories. 
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16500) Make asynchronous blocks deletion lock and unlock duration threshold configurable
Chengwei Wang created HDFS-16500: Summary: Make asynchronous blocks deletion lock and unlock duration threshold configurable Key: HDFS-16500 URL: https://issues.apache.org/jira/browse/HDFS-16500 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Chengwei Wang Assignee: Chengwei Wang I have backported the nice feature HDFS-16043 to our internal branch, and it works well in our testing cluster. I think it's better to make the fields deleteBlockLockTimeMs and deleteBlockUnlockIntervalTimeMs configurable, so that we can control the lock and unlock duration. And we should set the default value smaller to avoid blocking other requests for a long time. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16431) Truncate CallerContext in client side
[ https://issues.apache.org/jira/browse/HDFS-16431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479784#comment-17479784 ] Chengwei Wang commented on HDFS-16431: -- Hello [~hexiaoqiao] [~weichiu] [Stephen O'Donnell|https://github.com/sodonnel], could you please take a look? > Truncate CallerContext in client side > - > > Key: HDFS-16431 > URL: https://issues.apache.org/jira/browse/HDFS-16431 > Project: Hadoop HDFS > Issue Type: Improvement > Components: nn >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The context of CallerContext would be truncated when it exceeds the maximum > allowed length on the server side. I think it's better to check and truncate > on the client side to reduce the unnecessary network and memory overhead for > the NN. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16431) Truncate CallerContext in client side
[ https://issues.apache.org/jira/browse/HDFS-16431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479313#comment-17479313 ] Chengwei Wang commented on HDFS-16431: -- Created PR [https://github.com/apache/hadoop/pull/3909]. Refactored the CallerContext.Builder constructor to simplify usage and truncate the context before _CallerContext.Builder.build()_ > Truncate CallerContext in client side > - > > Key: HDFS-16431 > URL: https://issues.apache.org/jira/browse/HDFS-16431 > Project: Hadoop HDFS > Issue Type: Improvement > Components: nn >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The context of CallerContext would be truncated when it exceeds the maximum > allowed length on the server side. I think it's better to check and truncate > on the client side to reduce the unnecessary network and memory overhead for > the NN. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16431) Truncate CallerContext in client side
Chengwei Wang created HDFS-16431: Summary: Truncate CallerContext in client side Key: HDFS-16431 URL: https://issues.apache.org/jira/browse/HDFS-16431 Project: Hadoop HDFS Issue Type: Improvement Components: nn Reporter: Chengwei Wang Assignee: Chengwei Wang The context of CallerContext would be truncated when it exceeds the maximum allowed length on the server side. I think it's better to check and truncate on the client side to reduce the unnecessary network and memory overhead for the NN. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
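The client-side check proposed above amounts to capping the context's byte length before the RPC is sent. A minimal illustrative sketch, not the actual CallerContext code; the helper name and the limit value are assumptions:

```java
import java.nio.charset.StandardCharsets;

/**
 * Sketch of client-side truncation: cap the caller context to the server's
 * maximum byte length before it is ever sent, so the NN never receives
 * (or stores) the oversized tail. The limit is illustrative; HDFS reads
 * the real maximum from configuration.
 */
public class CallerContextTruncator {
  /** Keep at most maxLenBytes UTF-8 bytes of the context. */
  public static String truncate(String context, int maxLenBytes) {
    if (context == null) {
      return null;
    }
    byte[] bytes = context.getBytes(StandardCharsets.UTF_8);
    if (bytes.length <= maxLenBytes) {
      return context;
    }
    // Drop trailing bytes; a real implementation should also avoid
    // splitting a multi-byte character, which this simple sketch ignores.
    return new String(bytes, 0, maxLenBytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Keeps only the first 10 bytes of the context string.
    System.out.println(truncate("clientA:job-1234567890", 10)); // prints "clientA:jo"
  }
}
```

Doing this in the client saves the network and memory overhead the issue describes, since the server would otherwise truncate the same bytes after receiving them.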
[jira] [Resolved] (HDFS-16308) Enable get command run with multi-thread.
[ https://issues.apache.org/jira/browse/HDFS-16308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang resolved HDFS-16308. -- Resolution: Auto Closed > Enable get command run with multi-thread. > - > > Key: HDFS-16308 > URL: https://issues.apache.org/jira/browse/HDFS-16308 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > CopyFromLocal/Put was enabled to run with multiple threads by HDFS-11786 and > HADOOP-14698, which makes putting directories or multiple files faster. > So it's necessary to enable the get and copyToLocal commands to run with > multiple threads as well. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16308) Enable get command run with multi-thread.
[ https://issues.apache.org/jira/browse/HDFS-16308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440878#comment-17440878 ] Chengwei Wang commented on HDFS-16308: -- Closing this issue because the wrong Hadoop component was chosen. Created HADOOP-17998 instead. > Enable get command run with multi-thread. > - > > Key: HDFS-16308 > URL: https://issues.apache.org/jira/browse/HDFS-16308 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > CopyFromLocal/Put was enabled to run with multiple threads by HDFS-11786 and > HADOOP-14698, which makes putting directories or multiple files faster. > So it's necessary to enable the get and copyToLocal commands to run with > multiple threads as well. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16308) Enable get command run with multi-thread.
[ https://issues.apache.org/jira/browse/HDFS-16308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16308: - Description: CopyFromLocal/Put was enabled to run with multiple threads by HDFS-11786 and HADOOP-14698, which makes putting directories or multiple files faster. So it's necessary to enable the get and copyToLocal commands to run with multiple threads as well. was: CopyFromLocal/Put is enabled to run with multi-thread with HDFS-11786 and HADOOP-14698 . It's useful to enable get and copyToLocal command run with multi-thread. > Enable get command run with multi-thread. > - > > Key: HDFS-16308 > URL: https://issues.apache.org/jira/browse/HDFS-16308 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > CopyFromLocal/Put was enabled to run with multiple threads by HDFS-11786 and > HADOOP-14698, which makes putting directories or multiple files faster. > So it's necessary to enable the get and copyToLocal commands to run with > multiple threads as well. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16308) Enable get command run with multi-thread.
[ https://issues.apache.org/jira/browse/HDFS-16308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16308: - Description: CopyFromLocal/Put was enabled to run with multiple threads by HDFS-11786 and HADOOP-14698. It's useful to enable the get and copyToLocal commands to run with multiple threads. was: It's useful to enable get and copyToLocal command run with multi-thread. > Enable get command run with multi-thread. > - > > Key: HDFS-16308 > URL: https://issues.apache.org/jira/browse/HDFS-16308 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > CopyFromLocal/Put was enabled to run with multiple threads by HDFS-11786 and > HADOOP-14698. > It's useful to enable the get and copyToLocal commands to run with multiple threads. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16308) Enable get command run with multi-thread.
[ https://issues.apache.org/jira/browse/HDFS-16308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16308: - External issue ID: (was: HDFS-11786) > Enable get command run with multi-thread. > - > > Key: HDFS-16308 > URL: https://issues.apache.org/jira/browse/HDFS-16308 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > It's useful to enable the get and copyToLocal commands to run with multiple threads. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16308) Enable get command run with multi-thread.
Chengwei Wang created HDFS-16308: Summary: Enable get command run with multi-thread. Key: HDFS-16308 URL: https://issues.apache.org/jira/browse/HDFS-16308 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Reporter: Chengwei Wang Assignee: Chengwei Wang It's useful to enable the get and copyToLocal commands to run with multiple threads. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
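The improvement borrowed from CopyFromLocal/Put is to fan the per-file copies out over a thread pool instead of copying sequentially. A minimal sketch of that idea, using local file copies in place of HDFS reads; the real change would live in the FsShell get/copyToLocal commands, not in a class like this:

```java
import java.nio.file.*;
import java.util.*;
import java.util.concurrent.*;

/**
 * Sketch of a multi-threaded "get": submit each source file copy to a
 * fixed-size thread pool and wait for all of them. Local Paths stand in
 * for HDFS sources here.
 */
public class ParallelGet {
  public static void getAll(List<Path> sources, Path destDir, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (Path src : sources) {
        // Each file is copied on its own pool thread.
        futures.add(pool.submit(() -> {
          Files.copy(src, destDir.resolve(src.getFileName()),
              StandardCopyOption.REPLACE_EXISTING);
          return null;
        }));
      }
      for (Future<?> f : futures) {
        f.get(); // propagate any copy failure to the caller
      }
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    Path srcDir = Files.createTempDirectory("src");
    Path destDir = Files.createTempDirectory("dest");
    Path a = Files.writeString(srcDir.resolve("a.txt"), "A");
    Path b = Files.writeString(srcDir.resolve("b.txt"), "B");
    getAll(List.of(a, b), destDir, 2);
    System.out.println(Files.exists(destDir.resolve("a.txt"))); // prints "true"
  }
}
```

With many small files this overlaps the per-file latencies, which is the same reason the multi-threaded put in HDFS-11786/HADOOP-14698 is faster.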
[jira] [Comment Edited] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
[ https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434763#comment-17434763 ] Chengwei Wang edited comment on HDFS-14815 at 10/27/21, 9:56 AM: - Hi [~LiJinglun] [~ayushtkn], I think this feature may make users a little confused. Users will get different results when they try to set quota for mount table paths and normal paths via the dfsadmin CLI or the FileSystem#setQuota interface with _*fs.defaultFS=router*_, especially when calling FileSystem#setQuota inside user code. In other words, the _*ClientProtocol#setQuota*_ interface implemented inside the router is only partially available; this seems illogical and would make users doubt whether the hdfs cluster is healthy. In my opinion, since _*ClientProtocol#setQuota*_ is a public api, the requests should be all passed or all rejected to keep the behavior consistent. So, I prefer approaches 2 and 3, and we can choose the consistency strategy (*Strong* or *Eventual*) by configuration. was (Author: smarthan): HI [~LiJinglun] [~ayushtkn], I think this feature may make users a little confused. Users will get different result when they try to set quota for mount table paths and normal paths by dfsadmin CLI or FileSystem#setQuota interface with _*fs.defaultFS=router*_. especially when call FileSystem#setQuota inside users code. In other word, the _*ClientProtocol#setQuota*_ interface implement inside router is just partial available, this seems illogical and would make users doubt whether the hdfs cluster is healthy. In my opnions, _*ClientProtocol#setQuota as a public api,*_ the ** requests should be all passed or all rejected to keep behavior consistent. So, I prefer approach 2 and 3, and we can choose consistent strategy (*Strong* or *Eventually*) by configuration. > RBF: Update the quota in MountTable when calling setQuota on a MountTable src. 
> -- > > Key: HDFS-14815 > URL: https://issues.apache.org/jira/browse/HDFS-14815 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, > HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch > > > The method setQuota() can make the remote quota (the quota on real clusters) > inconsistent with the MountTable. I think we have 3 ways to fix it: > # Reject all the setQuota() rpcs if it tries to change the quota of a mount > table. > # Let setQuota() change the mount table quota. Update the quota on zk > first and then update remote quotas. > # Do nothing. The RouterQuotaUpdateService will finally make all the remote > quota right. We can tolerate short-term inconsistencies. > I'd like option 1 because I want the RouterAdmin to be the only entrance to > update the MountTable. > Option 3 we don't need to change anything, but the quota will be inconsistent > for a short term. The remote quota will be effective immediately and > auto-changed back after a while. Users might be confused about the behavior. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
[ https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434763#comment-17434763 ] Chengwei Wang edited comment on HDFS-14815 at 10/27/21, 9:55 AM: - Hi [~LiJinglun] [~ayushtkn], I think this feature may make users a little confused. Users will get different results when they try to set quota for mount table paths and normal paths via the dfsadmin CLI or the FileSystem#setQuota interface with _*fs.defaultFS=router*_, especially when calling FileSystem#setQuota inside user code. In other words, the _*ClientProtocol#setQuota*_ interface implemented inside the router is only partially available; this seems illogical and would make users doubt whether the hdfs cluster is healthy. In my opinion, since _*ClientProtocol#setQuota*_ is a public api, the requests should be all passed or all rejected to keep the behavior consistent. So, I prefer approaches 2 and 3, and we can choose the consistency strategy (*Strong* or *Eventual*) by configuration. was (Author: smarthan): HI [~LiJinglun] [~ayushtkn], I think this feature may make users a little confused. Users will get different result when they try to set quota for mount table paths and normal paths by dfsadmin CLI or FileSystem#setQuota interface with _*fs.defaultFS=router*_. especially when call FileSystem#setQuota inside users code. In other word, the _*ClientProtocol#setQuota*_ interface implement inside router is just partial available, this seems illogical and would make users doubt whether the hdfs cluster is healthy. In my opnions, _*ClientProtocol#setQuota as a public api,*_ the** _**_ requests should be all passed or all rejected to keep behavior consistent. So, I prefer approach 2 and 3, and we can choose consistent strategy (*Strong* or *Eventually*) by configuration. > RBF: Update the quota in MountTable when calling setQuota on a MountTable src. 
> -- > > Key: HDFS-14815 > URL: https://issues.apache.org/jira/browse/HDFS-14815 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, > HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch > > > The method setQuota() can make the remote quota (the quota on real clusters) > inconsistent with the MountTable. I think we have 3 ways to fix it: > # Reject all the setQuota() rpcs if it tries to change the quota of a mount > table. > # Let setQuota() change the mount table quota. Update the quota on zk > first and then update remote quotas. > # Do nothing. The RouterQuotaUpdateService will finally make all the remote > quota right. We can tolerate short-term inconsistencies. > I'd like option 1 because I want the RouterAdmin to be the only entrance to > update the MountTable. > Option 3 we don't need to change anything, but the quota will be inconsistent > for a short term. The remote quota will be effective immediately and > auto-changed back after a while. Users might be confused about the behavior. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
[ https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434763#comment-17434763 ] Chengwei Wang commented on HDFS-14815: -- Hi [~LiJinglun] [~ayushtkn], I think this feature may make users a little confused. Users will get different results when they try to set quota for mount table paths and normal paths via the dfsadmin CLI or the FileSystem#setQuota interface with _*fs.defaultFS=router*_, especially when calling FileSystem#setQuota inside user code. In other words, the _*ClientProtocol#setQuota*_ interface implemented inside the router is only partially available; this seems illogical and would make users doubt whether the hdfs cluster is healthy. In my opinion, since _*ClientProtocol#setQuota*_ is a public api, the requests should be all passed or all rejected to keep the behavior consistent. So, I prefer approaches 2 and 3, and we can choose the consistency strategy (*Strong* or *Eventual*) by configuration. > RBF: Update the quota in MountTable when calling setQuota on a MountTable src. > -- > > Key: HDFS-14815 > URL: https://issues.apache.org/jira/browse/HDFS-14815 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, > HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch > > > The method setQuota() can make the remote quota (the quota on real clusters) > inconsistent with the MountTable. I think we have 3 ways to fix it: > # Reject all the setQuota() rpcs if it tries to change the quota of a mount > table. > # Let setQuota() change the mount table quota. Update the quota on zk > first and then update remote quotas. > # Do nothing. The RouterQuotaUpdateService will finally make all the remote > quota right. We can tolerate short-term inconsistencies. > I'd like option 1 because I want the RouterAdmin to be the only entrance to > update the MountTable. 
> Option 3 we don't need to change anything, but the quota will be inconsistent > for a short term. The remote quota will be effective immediately and > auto-changed back after a while. Users might be confused about the behavior. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16271) RBF: NullPointerException when setQuota through routers with quota disabled
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428575#comment-17428575 ] Chengwei Wang commented on HDFS-16271: -- Hi [~elgoiri], thanks for your review. Submitted patch v002 [^HDFS-16271.002.patch] > RBF: NullPointerException when setQuota through routers with quota disabled > --- > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.1 >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Fix For: 3.3.1 > > Attachments: HDFS-16271.001.patch, HDFS-16271.002.patch > > > When we started routers with *dfs.federation.router.quota.enable=false* and > tried to setQuota through them, a NullPointerException was caught. > The cause of the NPE is that Router#quotaManager is not initialized when > dfs.federation.router.quota.enable=false, > but when executing a setQuota rpc request inside the router, we would use it in > the method Quota#isMountEntry without a null check. > I think it's better to check whether Router#isQuotaEnabled is true before using > Router#quotaManager, and throw an IOException with a readable message if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16271) RBF: NullPointerException when setQuota through routers with quota disabled
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Attachment: HDFS-16271.002.patch > RBF: NullPointerException when setQuota through routers with quota disabled > --- > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.1 >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Fix For: 3.3.1 > > Attachments: HDFS-16271.001.patch, HDFS-16271.002.patch > > > When we started routers with *dfs.federation.router.quota.enable=false* and > tried to setQuota through them, a NullPointerException was caught. > The cause of the NPE is that Router#quotaManager is not initialized when > dfs.federation.router.quota.enable=false, > but when executing a setQuota rpc request inside the router, we would use it in > the method Quota#isMountEntry without a null check. > I think it's better to check whether Router#isQuotaEnabled is true before using > Router#quotaManager, and throw an IOException with a readable message if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with quota disabled.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Attachment: HDFS-16271.001.patch Fix Version/s: 3.3.1 Tags: RBF Target Version/s: 3.3.1 Affects Version/s: 3.3.1 Status: Patch Available (was: Open) Submitted patch v001. > NullPointerException when setQuota through routers with quota disabled. > --- > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.1 >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Fix For: 3.3.1 > > Attachments: HDFS-16271.001.patch > > > When we started routers with *dfs.federation.router.quota.enable=false* and > tried to setQuota through them, a NullPointerException was caught. > The cause of the NPE is that Router#quotaManager is not initialized when > dfs.federation.router.quota.enable=false, > but when executing a setQuota rpc request inside the router, we would use it in > the method Quota#isMountEntry without a null check. > I think it's better to check whether Router#isQuotaEnabled is true before using > Router#quotaManager, and throw an IOException with a readable message if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with quota disabled.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Summary: NullPointerException when setQuota through routers with quota disabled. (was: NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.) > NullPointerException when setQuota through routers with quota disabled. > --- > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > When we started routers with *dfs.federation.router.quota.enable=false* and > tried to setQuota through them, a NullPointerException was caught. > The cause of the NPE is that Router#quotaManager is not initialized when > dfs.federation.router.quota.enable=false, > but when executing a setQuota rpc request inside the router, we would use it in > the method Quota#isMountEntry without a null check. > I think it's better to check whether Router#isQuotaEnabled is true before using > Router#quotaManager, and throw an IOException with a readable message if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Description: When we started routers with *dfs.federation.router.quota.enable=false* and tried to setQuota through them, a NullPointerException was caught. The cause of the NPE is that Router#quotaManager is not initialized when dfs.federation.router.quota.enable=false, but when executing a setQuota rpc request inside the router, we would use it in the method Quota#isMountEntry without a null check. I think it's better to check whether Router#isQuotaEnabled is true before using Router#quotaManager, and throw an IOException with a readable message if needed. was: When we started routers with *dfs.federation.router.quota.enable=false*, and try to setQuota through them, NullPointerException caught. The cuase of NPE is that the Router#quotaManager not initialized when dfs.federation.router.quota.enable=false, but when executing setQuota rpc request inside router, we wolud use it in method Quota#isMountEntry without null check . I think it's better to check whether Router#quotaManager is null, and throw a IOException with readable message if null. > NullPointerException when setQuota through routers with > dfs.federation.router.quota.enable=false. > - > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > When we started routers with *dfs.federation.router.quota.enable=false* and > tried to setQuota through them, a NullPointerException was caught. > The cause of the NPE is that Router#quotaManager is not initialized when > dfs.federation.router.quota.enable=false, > but when executing a setQuota rpc request inside the router, we would use it in > the method Quota#isMountEntry without a null check. 
> I think it's better to check whether Router#isQuotaEnabled is true before using > Router#quotaManager, and throw an IOException with a readable message if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
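The fail-fast guard proposed for this issue can be sketched in a few lines. The class below is a simplified stand-in for the Router code, not the actual implementation; only the configuration key name comes from the issue text:

```java
import java.io.IOException;

/**
 * Sketch of the proposed guard: when quota support is disabled, fail fast
 * with a readable IOException instead of dereferencing a null quotaManager.
 */
public class QuotaGuard {
  private final boolean quotaEnabled;
  private final Object quotaManager; // null when quota support is disabled

  public QuotaGuard(boolean quotaEnabled) {
    this.quotaEnabled = quotaEnabled;
    // Mirrors the bug: the manager is only created when quota is enabled.
    this.quotaManager = quotaEnabled ? new Object() : null;
  }

  /** Check the flag before touching the manager, as the issue suggests. */
  public Object checkQuotaManager() throws IOException {
    if (!quotaEnabled) {
      throw new IOException("The quota system is disabled in Router: set "
          + "dfs.federation.router.quota.enable=true to use setQuota.");
    }
    return quotaManager; // safe: non-null whenever quotaEnabled is true
  }
}
```

Callers that previously hit the NullPointerException would instead receive an IOException whose message tells the operator exactly which setting to change.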
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Description: When we start routers with *dfs.federation.router.quota.enable=false* and try to setQuota through them, a NullPointerException is caught. The cause of the NPE is that Router#quotaManager is not initialized when dfs.federation.router.quota.enable=false, but when executing the setQuota RPC request inside the router, it is used in method Quota#isMountEntry without a null check. I think it's better to check whether Router#quotaManager is null, and throw an IOException with a readable message if it is. was: When we start routers with dfs.federation.router.quota.enable=false and try to setQuota through them, a NullPointerException is caught. The cause of the NPE is that Router#quotaManager is not initialized when dfs.federation.router.quota.enable=false, but when executing the setQuota RPC request inside the router, it is used in method Quota#isMountEntry without a null check. I think it's better to check whether Router#quotaManager is null, and throw an IOException with a readable message if it is. > NullPointerException when setQuota through routers with > dfs.federation.router.quota.enable=false. > - > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > When we start routers with *dfs.federation.router.quota.enable=false* and > try to setQuota through them, a NullPointerException is caught. > The cause of the NPE is that Router#quotaManager is not initialized when > dfs.federation.router.quota.enable=false, > but when executing the setQuota RPC request inside the router, it is used in > method Quota#isMountEntry without a null check. > I think it's better to check whether Router#quotaManager is null, and throw > an IOException with a readable message if it is. 
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Description: When we start routers with dfs.federation.router.quota.enable=false and try to setQuota through them, a NullPointerException is caught. The cause of the NPE is that Router#quotaManager is not initialized when dfs.federation.router.quota.enable=false, but when executing the setQuota RPC request inside the router, it is used in method Quota#isMountEntry without a null check. I think it's better to check whether Router#quotaManager is null, and throw an IOException with a readable message if it is. > NullPointerException when setQuota through routers with > dfs.federation.router.quota.enable=false. > - > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > When we start routers with dfs.federation.router.quota.enable=false and > try to setQuota through them, a NullPointerException is caught. > The cause of the NPE is that Router#quotaManager is not initialized when > dfs.federation.router.quota.enable=false, > but when executing the setQuota RPC request inside the router, it is used in > method Quota#isMountEntry without a null check. > I think it's better to check whether Router#quotaManager is null, and throw > an IOException with a readable message if it is.
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Description: (was: When I try to setQuota through routers with *_dfs.federation.router.quota.enable=false_*, a NullPointerException is caught) > NullPointerException when setQuota through routers with > dfs.federation.router.quota.enable=false. > - > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major >
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Description: When I try to setQuota through routers with *_dfs.federation.router.quota.enable=false_*, a NullPointerException is caught (was: When I try to setQuota through routers with *_dfs.federation.router.quota.enable=false,_* ) > NullPointerException when setQuota through routers with > dfs.federation.router.quota.enable=false. > - > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > When I try to setQuota through routers with *_dfs.federation.router.quota.enable=false_*, a > NullPointerException is caught
[jira] [Created] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.
Chengwei Wang created HDFS-16271: Summary: NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false. Key: HDFS-16271 URL: https://issues.apache.org/jira/browse/HDFS-16271 Project: Hadoop HDFS Issue Type: Bug Reporter: Chengwei Wang Assignee: Chengwei Wang
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Description: When I try to setQuota through routers with *_dfs.federation.router.quota.enable=false,_* (was: When ) > NullPointerException when setQuota through routers with > dfs.federation.router.quota.enable=false. > - > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > When I try to setQuota through routers with > *_dfs.federation.router.quota.enable=false,_* -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.
[ https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-16271: - Description: When > NullPointerException when setQuota through routers with > dfs.federation.router.quota.enable=false. > - > > Key: HDFS-16271 > URL: https://issues.apache.org/jira/browse/HDFS-16271 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > When -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218709#comment-17218709 ] Chengwei Wang commented on HDFS-15630: -- [~ferhui] [~elgoiri], thanks for comments. {quote}Server.getCurCall().set(originCall); Is it necessary here? {quote} It's necessary for getting remote client ip through Server#getRemoteAddress(); Submit patch v006 [^HDFS-15630.006.patch] to complete javadoc for transferThreadLocalContext(). > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch, HDFS-15630.005.patch, > HDFS-15630.006.patch, HDFS-15630.test.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Attachment: HDFS-15630.006.patch > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch, HDFS-15630.005.patch, > HDFS-15630.006.patch, HDFS-15630.test.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218255#comment-17218255 ] Chengwei Wang edited comment on HDFS-15630 at 10/21/20, 12:00 PM: -- [~elgoiri] [~ferhui], Thanks for comments. After careful thought, this issue should be divided into 2 parts. # transfer the required thread local context to worker threads to avoid client ip and origin context lost. # append client ip to caller context only if it's absent to avoid client ip duplicated. To transfer context, it's better to set thread local variables directly rather than use method params. Just like this: {code:java} callables.add( () -> { transferThreadLocalContext(originCall, originContext); return invokeMethod(ugi, namenodes, proto, m, paramList); }); private void transferThreadLocalContext( final Call originCall, final CallerContext originContext) { Server.getCurCall().set(originCall); CallerContext.setCurrent(originContext); } {code} With these required context, we can always get client ip and caller context from thread local, and append client ip to caller context only if it’s absent. Submit patch v005 [^HDFS-15630.005.patch], could you please take a look? was (Author: smarthan): [~elgoiri] [~ferhui], Thanks for comments. After careful thought, this issue should be divided into 2 parts. # transfer the required thread local context to worker threads to avoid client ip and origin context lost. # append client ip to caller context only if it's absent to avoid client ip duplicated. To transfer context, I think it's better to set thread local variables directly rather than use method params. 
Just like this: {code:java} callables.add( () -> { transferThreadLocalContext(originCall, originContext); return invokeMethod(ugi, namenodes, proto, m, paramList); }); private void transferThreadLocalContext( final Call originCall, final CallerContext originContext) { Server.getCurCall().set(originCall); CallerContext.setCurrent(originContext); } {code} With these required context, we can always get client ip and caller context from thread local, and append client ip to caller context only if it’s absent. Submit patch v005 [^HDFS-15630.005.patch], could you please take a look? > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch, HDFS-15630.005.patch, > HDFS-15630.test.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
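The thread-local transfer shown in the patch snippet can be illustrated in isolation. The sketch below uses plain JDK classes only; the `CONTEXT` variable stands in for the server-side thread-locals (the current Call and CallerContext), since worker threads in a pool do not see the handler thread's thread-local values.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalTransferSketch {

    // Stand-in for server-side thread-local state such as the current Call
    // and CallerContext.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Mirrors the transferThreadLocalContext() idea: set the thread-local
    // explicitly at the start of the task running on the worker thread.
    static void transferThreadLocalContext(String originContext) {
        CONTEXT.set(originContext);
    }

    public static String runInWorker(String originContext) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> {
                transferThreadLocalContext(originContext);
                // Without the transfer above, CONTEXT.get() would be null here.
                return CONTEXT.get();
            };
            return pool.submit(task).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runInWorker("clientIp:10.0.0.1"));
    }
}
```

The same pattern explains why `Server.getCurCall().set(originCall)` is needed in the patch: the worker thread must carry the handler thread's state before `Server#getRemoteAddress()` can resolve the client IP.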
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218255#comment-17218255 ] Chengwei Wang commented on HDFS-15630: -- [~elgoiri] [~ferhui], Thanks for comments. After careful thought, this issue should be divided into 2 parts. # transfer the required thread local context to worker threads to avoid client ip and origin context lost. # append client ip to caller context only if it's absent to avoid client ip duplicated. To transfer context, I think it's better to set thread local variables directly rather than use method params. Just like this: {code:java} callables.add( () -> { transferThreadLocalContext(originCall, originContext); return invokeMethod(ugi, namenodes, proto, m, paramList); }); private void transferThreadLocalContext( final Call originCall, final CallerContext originContext) { Server.getCurCall().set(originCall); CallerContext.setCurrent(originContext); } {code} With these required context, we can always get client ip and caller context from thread local, and append client ip to caller context only if it’s absent. Submit patch v005 [^HDFS-15630.005.patch], could you please take a look? > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch, HDFS-15630.005.patch, > HDFS-15630.test.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. 
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Attachment: HDFS-15630.005.patch > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch, HDFS-15630.005.patch, > HDFS-15630.test.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217249#comment-17217249 ] Chengwei Wang commented on HDFS-15630: -- [~ferhui], thanks for your patch. Moving the call to appendClientIpToCallerContext() into RouterRpcServer may solve the issue, but we would have to add appendClientIpToCallerContext() to every RPC interface in RouterRpcServer. It seems more invasive to the original code, what do you think? > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch, HDFS-15630.test.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent.
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217248#comment-17217248 ] Chengwei Wang commented on HDFS-15630: -- [~elgoiri], thanks for your comment. I think that splitting appendClientIpToCallerContext() into two functions may be better. We should check whether the context contains the current client IP info first, and then add the client IP info to the context if needed. What do you think? > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch, HDFS-15630.test.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent.
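The two-function split suggested here could look like the following sketch. The method names and the comma-separated context format are assumptions for illustration; the real patch operates on Hadoop's CallerContext.

```java
public class CallerContextIpSketch {

    // Check first: does the context already carry this client's IP entry?
    public static boolean containsClientIp(String context, String clientIp) {
        return context != null && context.contains("clientIp:" + clientIp);
    }

    // Then add only if absent, so repeated proxy hops do not duplicate it.
    public static String appendClientIpIfAbsent(String context, String clientIp) {
        if (containsClientIp(context, clientIp)) {
            return context;
        }
        String entry = "clientIp:" + clientIp;
        return (context == null || context.isEmpty()) ? entry : context + "," + entry;
    }

    public static void main(String[] args) {
        String ctx = appendClientIpIfAbsent("clientContext", "10.0.0.1");
        System.out.println(ctx); // clientContext,clientIp:10.0.0.1
        // Second call is a no-op: the entry is already present.
        System.out.println(appendClientIpIfAbsent(ctx, "10.0.0.1"));
    }
}
```

Keeping the check and the append separate makes the "only if absent" behavior easy to unit-test on its own.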
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216410#comment-17216410 ] Chengwei Wang commented on HDFS-15630: -- pending jenkins. > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216269#comment-17216269 ] Chengwei Wang commented on HDFS-15630: -- Submit patch v004. [~elgoiri] [~ferhui],could you please take a look? > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Attachment: HDFS-15630.004.patch > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch, HDFS-15630.004.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216160#comment-17216160 ] Chengwei Wang commented on HDFS-15630: -- [~ferhui], [~elgoiri], thanks for the comments. {quote}Client ip will not be null or empty? {quote} A null or empty client IP makes no sense; it won't be added to the CallerContext, as it would be regarded as invalid, so I think it's better to fail fast. {quote}context contains clientIp:x.x.x.x because the callercontext added clientip is used again? If the param callercontext is the original one remote client sends, we use that and it will not be a problem? {quote} Yes, it would be a problem if the client IP string already exists when we only check the IP string, so the check condition should change to context.contains("clientIp:x.x.x.x"). Then it would be OK to skip adding the duplicated client IP to the context even if the IP was added by other clients, because we could still get the correct client IP on the server side, which would meet our needs. What do you think? {quote}Why remove the assert? Feel strange context is not null before we set. {quote} I removed the assert because I set the caller context in the new unit test, which would lead to this assert failing. To avoid confusion, I would restore TestRouterRpc#testMkdirsWithCallerContext(), and do all the checks in the new UT. {quote}Not sure that Whether it is expected when a machine has more eth interfaces {quote} It would fail if some other info is added to the caller context, so it's better to parse out the client IP info first, and then check the IP. {quote}Let's put the full javadoc for the old invokeMethod signature and just add "Set clientIp and callerContext get from server context." at the end of the description. {quote} OK, I would add the javadoc as you said. > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. 
> -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent.
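The point about checking the full "clientIp:x.x.x.x" entry rather than the bare IP string can be taken one step further: matching whole fields avoids a false positive when one IP is a prefix of another. Below is a self-contained sketch; the comma field separator is an assumption for illustration.

```java
import java.util.Arrays;

public class ClientIpTokenCheckSketch {

    // Matches the exact "clientIp:<ip>" field instead of doing a substring
    // search, so "clientIp:10.0.0.1" does not match inside "clientIp:10.0.0.10".
    public static boolean hasClientIpField(String context, String clientIp) {
        if (context == null) {
            return false;
        }
        String wanted = "clientIp:" + clientIp;
        return Arrays.stream(context.split(","))
                     .anyMatch(field -> field.trim().equals(wanted));
    }

    public static void main(String[] args) {
        System.out.println(hasClientIpField("ctx,clientIp:10.0.0.1", "10.0.0.1"));  // true
        // A naive contains() check would wrongly return true here:
        System.out.println(hasClientIpField("ctx,clientIp:10.0.0.10", "10.0.0.1")); // false
    }
}
```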
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17215132#comment-17215132 ] Chengwei Wang commented on HDFS-15630: -- Submit patch v003 to fix the java doc and UT. [~elgoiri] [~ferhui], could you help to take a look again? > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Attachment: HDFS-15630.003.patch > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch, > HDFS-15630.003.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214577#comment-17214577 ] Chengwei Wang commented on HDFS-15630: -- [~elgoiri] Thanks for the review. {quote}To avoid churn, let's keep the all method signature at the beginning and add the new one afterwards. Let's also have a javadoc for both; in the old one just add a comment saying we take the context from the server. {quote} I will add doc to the old method and adjust its position. {quote}For the test in TestRouterRpc, can we actually check that is not null and preferably something more specific? {quote} Did you mean that we should check the specified value for CallerContext in TestRouterRpc#testMkdirsWithCallerContext() like this: {code:java} String expectContext = "callerContext=clientContext,clientIp:" + InetAddress.getLocalHost().getHostAddress(); // Assert the caller context is correct at client side assertEquals(expectContext, CallerContext.getCurrent().getContext()); // Assert the caller context transfer to server side correctly for (String line : auditlog.getOutput().split("\n")) { if (line.contains("src=" + dirPath)) { assertTrue(line.trim().endsWith(expectContext)); } } {code} > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214376#comment-17214376 ] Chengwei Wang commented on HDFS-15630: -- [~ferhui] thanks for review. Submit patch v002 [^HDFS-15630.002.patch] to fix the related failed UT. And I will try to create a github PR later. > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Attachment: HDFS-15630.002.patch > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213758#comment-17213758 ] Chengwei Wang commented on HDFS-15630: -- Submit patch v001. Hi [~ferhui], could you please help to review? > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Attachment: HDFS-15630.001.patch Status: Patch Available (was: Open) > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15630.001.patch > > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Summary: RBF: Fix wrong client IP info in CallerContext when requests mount points with (was: RBF: fix wrong client IP info in CallerContext) > RBF: Fix wrong client IP info in CallerContext when requests mount points > with > --- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Summary: RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations. (was: RBF: Fix wrong client IP info in CallerContext when requests mount points with ) > RBF: Fix wrong client IP info in CallerContext when requests mount points > with multi-destinations. > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15630) RBF: fix wrong client IP info in CallerContext
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213042#comment-17213042 ] Chengwei Wang commented on HDFS-15630: -- As reported in HDFS-13293, when we try to request mount points with multi-destinations, the client IP in CallerContext would be wrong. * the clientIp would duplicate in CallerContext when RouterRpcClient#invokeSequential. {quote}2020-10-13 14:46:17,084 [IPC Server handler 1 on default port 14269] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/test_dir_with_callercontext dst=null perm=null proto=rpc callerContext=clientContext,clientIp:10.145.43.214,clientIp:10.145.43.214 {quote} * the clientIp would miss in CallerContext when RouterRpcClient#invokeConcurrent. {quote}2020-10-13 14:46:18,305 [IPC Server handler 0 on default port 14269] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null proto=rpc callerContext=clientContext {quote} I would try to fix it soon. > RBF: fix wrong client IP info in CallerContext > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
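The duplication described in the first item above can be sketched in a few lines: append the clientIp tag to the caller context only when the context does not already carry one. This is hypothetical illustration code; the class, helper name, and tag handling are assumptions for the sketch, not the actual RouterRpcClient logic.

```java
// Hypothetical sketch: append the clientIp tag to a caller context string
// only when it is not already present, avoiding the duplicated
// "clientIp:..." entries seen in the invokeSequential audit log above.
public class CallerContextSketch {
    static final String CLIENT_IP_TAG = "clientIp:";

    // Returns a context carrying at most one clientIp entry.
    static String appendClientIpIfAbsent(String context, String clientIp) {
        if (context.contains(CLIENT_IP_TAG)) {
            return context; // already carries a client IP, do not append again
        }
        return context + "," + CLIENT_IP_TAG + clientIp;
    }

    public static void main(String[] args) {
        String once = appendClientIpIfAbsent("clientContext", "10.145.43.214");
        // prints clientContext,clientIp:10.145.43.214
        System.out.println(once);
        // a second call leaves the context unchanged
        System.out.println(appendClientIpIfAbsent(once, "10.145.43.214"));
    }
}
```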
[jira] [Updated] (HDFS-15630) RBF: fix wrong client IP info in CallerContext
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Summary: RBF: fix wrong client IP info in CallerContext (was: RBF: wrong client IP info in CallerContext when requests mount points with multiple destinations.) > RBF: fix wrong client IP info in CallerContext > -- > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer
[ https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213026#comment-17213026 ] Chengwei Wang commented on HDFS-13293: -- Hi [~ferhui], I have filed a new jira HDFS-15630 and would try to fix the issue soon. > RBF: The RouterRPCServer should transfer client IP via CallerContext to > NamenodeRpcServer > - > > Key: HDFS-13293 > URL: https://issues.apache.org/jira/browse/HDFS-13293 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: maobaolong >Assignee: Hui Fei >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HDFS-13293.001.patch > > Time Spent: 4h 40m > Remaining Estimate: 0h > > Otherwise, the namenode don't know the client's callerContext > This jira focuses on audit log which logs real client ip. Leave locality to > HDFS-13248 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: wrong client IP info in CallerContext when requests mount points with multiple destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Description: There are two issues about client IP info in CallerContext when we try to request mount points with multi-destinations. # the clientIp would duplicate in CallerContext when RouterRpcClient#invokeSequential. # the clientIp would miss in CallerContext when RouterRpcClient#invokeConcurrent. was: There are two issues about client IP info in CallerContext when we try to mount points with multi-destinations. # the clientIp would duplicate in CallerContext when RouterRpcClient#invokeSequential. # the clientIp would miss in CallerContext when RouterRpcClient#invokeConcurrent. > RBF: wrong client IP info in CallerContext when requests mount points with > multiple destinations. > - > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > There are two issues about client IP info in CallerContext when we try to > request mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: wrong client IP info in CallerContext when requests mount points with multiple destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Description: There are two issues about client IP info in CallerContext when we try to mount points with multi-destinations. # the clientIp would duplicate in CallerContext when RouterRpcClient#invokeSequential. # the clientIp would miss in CallerContext when RouterRpcClient#invokeConcurrent. was:There are two issues about when a mount table contains multi-destinations. > RBF: wrong client IP info in CallerContext when requests mount points with > multiple destinations. > - > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > There are two issues about client IP info in CallerContext when we try to > mount points with multi-destinations. > # the clientIp would duplicate in CallerContext when > RouterRpcClient#invokeSequential. > # the clientIp would miss in CallerContext when > RouterRpcClient#invokeConcurrent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: wrong client IP info in CallerContext when requests mount points with multiple destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Description: There are two issues about when a mount table contains multi-destinations. (was: The ) > RBF: wrong client IP info in CallerContext when requests mount points with > multiple destinations. > - > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > There are two issues about when a mount table contains multi-destinations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: wrong client IP info in CallerContext when requests mount points with multiple destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Summary: RBF: wrong client IP info in CallerContext when requests mount points with multiple destinations. (was: RBF: wrong client IP info in CallerContext when requests mount point with multiple destinations.) > RBF: wrong client IP info in CallerContext when requests mount points with > multiple destinations. > - > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > The -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15630) RBF: wrong client IP info in CallerContext when requests mount point with multiple destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15630: - Description: The > RBF: wrong client IP info in CallerContext when requests mount point with > multiple destinations. > > > Key: HDFS-15630 > URL: https://issues.apache.org/jira/browse/HDFS-15630 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > > The -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15630) RBF: wrong client IP info in CallerContext when requests mount point with multiple destinations.
Chengwei Wang created HDFS-15630: Summary: RBF: wrong client IP info in CallerContext when requests mount point with multiple destinations. Key: HDFS-15630 URL: https://issues.apache.org/jira/browse/HDFS-15630 Project: Hadoop HDFS Issue Type: Bug Components: rbf Reporter: Chengwei Wang Assignee: Chengwei Wang -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer
[ https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212912#comment-17212912 ] Chengwei Wang edited comment on HDFS-13293 at 10/13/20, 8:06 AM: - Hi [~ferhui], did you test this patch while a mount table with multi-destinations? In my test cases, there were two issues when a mount table contains multi-destinations. 1. The clientIp duplicated in CallerContext when RouterRpcClient#invokeSequential. It caused by that current thread would append the clientIp to current CallerContext when it called RouterRpcClient#invokeMethod every time. {code:java} 2020-10-13 14:46:17,083 [IPC Server handler 6 on default port 13189] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/test_dir_with_callercontextdst=nullperm=null proto=rpc callerContext=clientContext,clientIp:10.145.43.214 2020-10-13 14:46:17,084 [IPC Server handler 1 on default port 14269] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/test_dir_with_callercontextdst=nullperm=null proto=rpc callerContext=clientContext,clientIp:10.145.43.214,clientIp:10.145.43.214{code} 2. the clientIp missed in CallerContext when RouterRpcClient#invokeConcurrent. It caused by that RouterRpcClient#invokeConcurrent used a ThreadPoolExecutor to execute request concurrent, however the worker threads of ThreadPoolExecutor can't get clientIp by Server#getRemoteAddress (thread local value only initialized of current thread). 
{code:java} 2020-10-13 14:46:18,305 [IPC Server handler 7 on default port 13189] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null proto=rpc callerContext=clientContext 2020-10-13 14:46:18,305 [IPC Server handler 0 on default port 14269] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null proto=rpc callerContext=clientContext {code} Please help to check the issues, and I would try to fix these if necessary. was (Author: smarthan): Hi [~ferhui], did you test this patch while a mount table with multi-destinations? In my test cases, there were two issues when a mount table contains multi-destinations. 1. The clientIp duplicated in CallerContext when RouterRpcClient#invokeSequential. It caused by that current thread would append the clientIp to current CallerContext when it called RouterRpcClient#invokeMethod every time. {code:java} 2020-10-13 14:46:17,083 [IPC Server handler 6 on default port 13189] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/test_dir_with_callercontextdst=nullperm=null proto=rpc callerContext=clientContext,clientIp:10.145.43.214 2020-10-13 14:46:17,084 [IPC Server handler 1 on default port 14269] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/test_dir_with_callercontextdst=nullperm=null proto=rpc callerContext=clientContext,clientIp:10.145.43.214,clientIp:10.145.43.214{code} 2. the clientIp missed in CallerContext when RouterRpcClient#invokeConcurrent. 
It caused by that RouterRpcClient#invokeConcurrent used a ThreadPoolExecutor to execute request concurrent, however the worker threads of ThreadPoolExecutor can't get clientIp by Server#getRemoteAddress (thread local value only initialized of current thread). {code:java} 2020-10-13 14:46:18,305 [IPC Server handler 7 on default port 13189] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null proto=rpc callerContext=clientContext 2020-10-13 14:46:18,305 [IPC Server handler 0 on default port 14269] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null proto=rpc callerContext=clientContext {code} Please help to check the issues, and I would try to fix these if necessary. > RBF: The RouterRPCServer should transfer client IP via CallerContext to > NamenodeRpcServer > - > > Key: HDFS-13293 > URL: https://issues.apache.org/jira
[jira] [Commented] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer
[ https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212912#comment-17212912 ] Chengwei Wang commented on HDFS-13293: -- Hi [~ferhui], did you test this patch with a mount table that has multi-destinations? In my test cases, there were two issues when a mount table contains multi-destinations. 1. The clientIp is duplicated in CallerContext when RouterRpcClient#invokeSequential is used. This is caused by the current thread appending the clientIp to the current CallerContext every time it calls RouterRpcClient#invokeMethod. {code:java} 2020-10-13 14:46:17,083 [IPC Server handler 6 on default port 13189] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/test_dir_with_callercontext dst=null perm=null proto=rpc callerContext=clientContext,clientIp:10.145.43.214 2020-10-13 14:46:17,084 [IPC Server handler 1 on default port 14269] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/test_dir_with_callercontext dst=null perm=null proto=rpc callerContext=clientContext,clientIp:10.145.43.214,clientIp:10.145.43.214{code} 2. The clientIp is missing from CallerContext when RouterRpcClient#invokeConcurrent is used. This is caused by RouterRpcClient#invokeConcurrent using a ThreadPoolExecutor to execute requests concurrently; the worker threads of the ThreadPoolExecutor cannot get the clientIp via Server#getRemoteAddress, because that thread-local value is only initialized for the handler thread. 
{code:java} 2020-10-13 14:46:18,305 [IPC Server handler 7 on default port 13189] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null proto=rpc callerContext=clientContext 2020-10-13 14:46:18,305 [IPC Server handler 0 on default port 14269] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8680)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null proto=rpc callerContext=clientContext {code} Please help to check the issues, and I would try to fix these if necessary. > RBF: The RouterRPCServer should transfer client IP via CallerContext to > NamenodeRpcServer > - > > Key: HDFS-13293 > URL: https://issues.apache.org/jira/browse/HDFS-13293 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: Hui Fei >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HDFS-13293.001.patch > > Time Spent: 4h 40m > Remaining Estimate: 0h > > Otherwise, the namenode don't know the client's callerContext > This jira focuses on audit log which logs real client ip. Leave locality to > HDFS-13248 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
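The second issue above (the thread-local value lost inside the ThreadPoolExecutor) can be illustrated with a minimal, self-contained sketch. This is hypothetical code, not the actual RouterRpcClient or Server implementation: a value stored in a ThreadLocal on the submitting handler thread is invisible to pool worker threads unless it is captured into a local variable before the task is submitted.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the thread-local problem described above: a value set on the
// submitting ("handler") thread is not visible to pool worker threads
// unless it is captured explicitly and handed to the task.
public class ThreadLocalCapture {
    static final ThreadLocal<String> CLIENT_IP = new ThreadLocal<>();

    // Reads the thread-local directly from a pool worker thread;
    // returns "null" because the worker cannot see the submitter's value.
    static String readFromWorker() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return String.valueOf(pool.submit(() -> CLIENT_IP.get()).get());
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Captures the submitter's value first, then closes over the copy,
    // so the worker thread sees the right client IP.
    static String readCapturedFromWorker() {
        final String captured = CLIENT_IP.get();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return String.valueOf(pool.submit(() -> captured).get());
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        CLIENT_IP.set("10.145.43.214"); // set on the "handler" thread
        System.out.println("direct read in worker: " + readFromWorker());      // prints null
        System.out.println("captured copy: " + readCapturedFromWorker());      // prints 10.145.43.214
    }
}
```

A fix along these lines would capture the client IP on the handler thread before handing work to the pool, which matches the direction this comment proposes.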
[jira] [Updated] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15493: - Description: While loading INodeDirectorySection of fsimage, it will update name cache and block map after added inode file to inode directory. It would reduce time cost of fsimage loading to enable these steps run in parallel. In our test case, with patch HDFS-13694 and HDFS-14617, the time cost to load fsimage (220M files & 240M blocks) is 470s, with this patch , the time cost reduce to 410s. was: While loading INodeDirectorySection of fsimage, it will update name cache and block map after added inode file to inode directory. It would reduce time cost of fsimage loading to enable these steps run in parallel. In our test case, with patch HDFS-13694 and HDFS-14617, the time cost to load fsimage (220M files & 240M blocks) is 470s, with this patch , the time cost reduc to 410s. > Update block map and name cache in parallel while loading fsimage. > -- > > Key: HDFS-15493 > URL: https://issues.apache.org/jira/browse/HDFS-15493 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15493.001.patch, HDFS-15493.002.patch, > HDFS-15493.003.patch, HDFS-15493.004.patch, HDFS-15493.005.patch, > HDFS-15493.006.patch, HDFS-15493.007.patch, HDFS-15493.008.patch, > fsimage-loading.log > > > While loading INodeDirectorySection of fsimage, it will update name cache and > block map after added inode file to inode directory. It would reduce time > cost of fsimage loading to enable these steps run in parallel. > In our test case, with patch HDFS-13694 and HDFS-14617, the time cost to load > fsimage (220M files & 240M blocks) is 470s, with this patch , the time cost > reduce to 410s. 
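As a rough illustration of the idea in HDFS-15493 (hypothetical code, not the actual FSImageFormatPBINode loader), the sketch below hands the name-cache and block-map updates to dedicated single-threaded executors, so they overlap with the main loading loop instead of running strictly in sequence; all class, field, and method names here are assumptions for the sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the main thread keeps attaching inode files to their
// parent directories, while name-cache and block-map updates run on their
// own single-threaded executors, letting the three steps proceed in parallel.
public class ParallelImageLoadSketch {
    final Map<String, Integer> nameCache = new ConcurrentHashMap<>();
    final Map<Long, String> blockMap = new ConcurrentHashMap<>();
    final ExecutorService nameCachePool = Executors.newSingleThreadExecutor();
    final ExecutorService blockMapPool = Executors.newSingleThreadExecutor();

    void loadInode(long inodeId, String name) {
        // main thread: attach inode to its parent directory (elided)...
        nameCachePool.submit(() -> nameCache.merge(name, 1, Integer::sum));
        blockMapPool.submit(() -> blockMap.put(inodeId, name));
    }

    // Drain both executors so all deferred updates are applied.
    void finish() {
        nameCachePool.shutdown();
        blockMapPool.shutdown();
        try {
            nameCachePool.awaitTermination(1, TimeUnit.MINUTES);
            blockMapPool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ParallelImageLoadSketch loader = new ParallelImageLoadSketch();
        loader.loadInode(1L, "a");
        loader.loadInode(2L, "a");
        loader.loadInode(3L, "b");
        loader.finish();
        System.out.println(loader.blockMap.size()); // prints 3
    }
}
```

Single-threaded executors keep each map updated by one writer, which is why this decomposition can overlap the phases without per-entry locking.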
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175932#comment-17175932 ] Chengwei Wang commented on HDFS-15493: -- Thank you [~sodonnell], it's my pleasure to be a contributor of HDFS project :D > Update block map and name cache in parallel while loading fsimage. > -- > > Key: HDFS-15493 > URL: https://issues.apache.org/jira/browse/HDFS-15493 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Major > Attachments: HDFS-15493.001.patch, HDFS-15493.002.patch, > HDFS-15493.003.patch, HDFS-15493.004.patch, HDFS-15493.005.patch, > HDFS-15493.006.patch, HDFS-15493.007.patch, HDFS-15493.008.patch, > fsimage-loading.log > > > While loading INodeDirectorySection of fsimage, it will update name cache and > block map after added inode file to inode directory. It would reduce time > cost of fsimage loading to enable these steps run in parallel. > In our test case, with patch HDFS-13694 and HDFS-14617, the time cost to load > fsimage (220M files & 240M blocks) is 470s, with this patch , the time cost > reduc to 410s. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175422#comment-17175422 ] Chengwei Wang edited comment on HDFS-15493 at 8/11/20, 10:00 AM: - Submit patch v007 [^HDFS-15493.007.patch]. Hi [~sodonnell], thanks for your advice. I have removed the blank line and added the unit test which worked in my test. Please help to review this patch. was (Author: smarthan): Submit patch v007 [^HDFS-15493.007.patch]. Hi [~sodonnell], thanks for your advice. I have removed the blank line and added the unit test which worked in my test. Please help to review this patch. > Update block map and name cache in parallel while loading fsimage. > -- > > Key: HDFS-15493 > URL: https://issues.apache.org/jira/browse/HDFS-15493 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Chengwei Wang >Priority: Major > Attachments: HDFS-15493.001.patch, HDFS-15493.002.patch, > HDFS-15493.003.patch, HDFS-15493.004.patch, HDFS-15493.005.patch, > HDFS-15493.006.patch, HDFS-15493.007.patch, fsimage-loading.log > > > While loading INodeDirectorySection of fsimage, it will update name cache and > block map after added inode file to inode directory. It would reduce time > cost of fsimage loading to enable these steps run in parallel. > In our test case, with patch HDFS-13694 and HDFS-14617, the time cost to load > fsimage (220M files & 240M blocks) is 470s, with this patch , the time cost > reduc to 410s. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175422#comment-17175422 ] Chengwei Wang commented on HDFS-15493: -- Submit patch v007 [^HDFS-15493.007.patch]. Hi [~sodonnell], thanks for your advice. I have removed the blank line and added the unit test which worked in my test. Please help to review this patch. > Update block map and name cache in parallel while loading fsimage. > -- > > Key: HDFS-15493 > URL: https://issues.apache.org/jira/browse/HDFS-15493 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Chengwei Wang >Priority: Major > Attachments: HDFS-15493.001.patch, HDFS-15493.002.patch, > HDFS-15493.003.patch, HDFS-15493.004.patch, HDFS-15493.005.patch, > HDFS-15493.006.patch, HDFS-15493.007.patch, fsimage-loading.log > > > While loading INodeDirectorySection of fsimage, it will update name cache and > block map after added inode file to inode directory. It would reduce time > cost of fsimage loading to enable these steps run in parallel. > In our test case, with patch HDFS-13694 and HDFS-14617, the time cost to load > fsimage (220M files & 240M blocks) is 470s, with this patch , the time cost > reduc to 410s. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15493: - Attachment: HDFS-15493.007.patch
[jira] [Updated] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15493: - Attachment: HDFS-15493.006.patch
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174042#comment-17174042 ] Chengwei Wang commented on HDFS-15493: -- Submit patch v006 [^HDFS-15493.006.patch]. Removed the redundant `fillUpInodeList(...)` call and fixed the checkstyle and JavaDoc problems.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174019#comment-17174019 ] Chengwei Wang commented on HDFS-15493: -- Hi [~sodonnell], thanks for your detailed review. {quote}Before this change, there may have been some inodes which were only referred to in references, but those INodes must be in the inode section of the image. Therefore we will handle all inodes mentioned in the references already with the change in `loadINodesInSection(...)` and hence should remove the `fillUpInodeList(...)` call below - do you think that is correct? {quote} I think you are absolutely right: the INodeReference is constructed with an inode obtained from the inodeMap, which has already been populated during `loadINodesInSection(...)`, so it makes no sense to call `fillUpInodeList(...)` again. I will remove the redundant call and fix the checkstyle and JavaDoc problems.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173604#comment-17173604 ] Chengwei Wang commented on HDFS-15493: -- Hi [~sodonnell], can you help to review this new patch?
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169672#comment-17169672 ] Chengwei Wang commented on HDFS-15493: -- Submit a new patch [^HDFS-15493.005.patch]. Removed the feature switch and simplified some related code.
[jira] [Updated] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15493: - Attachment: HDFS-15493.005.patch
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169665#comment-17169665 ] Chengwei Wang commented on HDFS-15493: -- Hi [~sodonnell], thanks for your review. {quote}I think we should create a new patch where the feature cannot be enabled / disabled - just have it always on, as I cannot think of a good reason someone should turn it off, and it will make the code simpler if we just remove the switch. What do you think? {quote} I agree with you. It's better to remove the feature switch. I will remove it and submit a new patch soon.
[jira] [Comment Edited] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168635#comment-17168635 ] Chengwei Wang edited comment on HDFS-15493 at 7/31/20, 11:06 AM: - After carefully reviewing the code that updates the blocks map and name cache, I found it is feasible to start this work as soon as loading of the INodeSection begins, and to shut down the executors once the INodeDirectorySection has finished loading. That way, almost no time is spent waiting for the executors to terminate. Submitted a patch [^HDFS-15493.004.patch] based on this approach. It uses two single-thread executors and updates without locking. Tested this patch twice. {code:java} Test1. 20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections 20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1 20/07/31 18:27:51 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 367 seconds. Test2. 20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections 20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1 20/07/31 18:48:04 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 363 seconds.{code} This gives about a 20% speedup in my tests and reduces the time cost from 460s+ to 360s+. I think this patch may be the best choice; [~sodonnell], can you help test it on trunk?
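The two-executor scheme described in the comment above can be sketched roughly as follows. This is a hypothetical minimal illustration, not the actual HDFS-15493 patch: the class `ParallelUpdateSketch`, its methods, and the plain lists standing in for the real BlocksMap and name cache are all invented for this example. The point it demonstrates is that each shared structure is only ever touched by its own single-thread executor, so no locking is needed, and shutting the executors down after directory loading completes waits only for whatever small backlog remains in the queues.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the two-single-thread-executor pattern: the caller
// "loads" inodes while block-map and name-cache updates are queued onto
// dedicated executors running concurrently with further loading.
public class ParallelUpdateSketch {
    private final ExecutorService blocksMapExecutor = Executors.newSingleThreadExecutor();
    private final ExecutorService nameCacheExecutor = Executors.newSingleThreadExecutor();
    // Stand-ins for the real BlocksMap and name cache (assumptions for the sketch).
    final List<String> blocksMap = new ArrayList<>();
    final List<String> nameCache = new ArrayList<>();

    void loadInode(String name, String block) {
        // Each structure is updated only by its own executor thread,
        // so these additions need no synchronization.
        blocksMapExecutor.execute(() -> blocksMap.add(block));
        nameCacheExecutor.execute(() -> nameCache.add(name));
    }

    void finishLoading() throws InterruptedException {
        // Shut down once the INodeDirectory section is done; by then the
        // queues are usually drained, so the wait here is near zero.
        blocksMapExecutor.shutdown();
        nameCacheExecutor.shutdown();
        blocksMapExecutor.awaitTermination(1, TimeUnit.MINUTES);
        nameCacheExecutor.awaitTermination(1, TimeUnit.MINUTES);
    }

    public static void main(String[] args) throws InterruptedException {
        ParallelUpdateSketch s = new ParallelUpdateSketch();
        for (int i = 0; i < 1000; i++) {
            s.loadInode("file-" + i, "blk_" + i);
        }
        s.finishLoading();
        System.out.println(s.blocksMap.size() + " " + s.nameCache.size()); // prints "1000 1000"
    }
}
```

The "total waiting duration: 1" in the logs above corresponds to the `awaitTermination` step: because updates are queued as loading proceeds, almost nothing is left to drain at shutdown time.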
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168635#comment-17168635 ] Chengwei Wang commented on HDFS-15493: -- After carefully reviewing the code that updates the blocks map and name cache, I found it is feasible to start this work as soon as loading of the INodeSection begins, and to shut down the executors once the INodeDirectorySection has finished loading. That way, almost no time is spent waiting for the executors to terminate. Submitted a patch [^HDFS-15493.004.patch] based on this approach. It uses two single-thread executors and updates without locking. Tested this patch twice. {code:java} Test1. 20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections 20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1 20/07/31 18:27:51 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 367 seconds. Test2. 20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections 20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1 20/07/31 18:48:04 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 363 seconds.{code} This gives about a 20% speedup in my tests and reduces the time cost from 460s+ to 360s+. I think this patch may be the best choice; [~sodonnell], can you help test it on trunk?
[jira] [Updated] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengwei Wang updated HDFS-15493: - Attachment: HDFS-15493.004.patch