[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143504#comment-17143504 ]

Chen Liang commented on HDFS-15421:
-----------------------------------

Thanks for reporting, [~kihwal], and thanks [~aajisaka] for working on this! Good catch on the missing updates; the change looks good to me.

> IBR leak causes standby NN to be stuck in safe mode
> ---------------------------------------------------
>
>          Key: HDFS-15421
>          URL: https://issues.apache.org/jira/browse/HDFS-15421
>      Project: Hadoop HDFS
>   Issue Type: Bug
>   Components: namenode
>     Reporter: Kihwal Lee
>     Assignee: Akira Ajisaka
>     Priority: Blocker
>       Labels: release-blocker
>  Attachments: HDFS-15421-000.patch, HDFS-15421-001.patch, HDFS-15421.002.patch, HDFS-15421.003.patch
>
>
> After HDFS-14941, the update of the global gen stamp is delayed in certain
> situations. This makes the last set of incremental block reports from an append
> appear to be "from the future", which causes them to be simply re-queued to the
> pending DN message queue rather than processed to complete the block. The last
> set of IBRs will leak and never be cleaned up until the NameNode transitions to
> active. The size of {{pendingDNMessages}} constantly grows until then.
> If a leak happens while in startup safe mode, the NameNode will never be
> able to come out of safe mode on its own.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
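The re-queue behaviour described in the issue can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hadoop code: the class `PendingIbrModel`, its nested `Ibr` type, and all method names are hypothetical, and the real standby NameNode logic is considerably more involved. The model shows why a report whose generation stamp is ahead of the global generation stamp stays in the queue on every pass until the global stamp catches up.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of a pending DataNode message queue; hypothetical, not Hadoop code. */
class PendingIbrModel {

    /** Hypothetical stand-in for one queued incremental block report (IBR). */
    static final class Ibr {
        final long blockId;
        final long genStamp;

        Ibr(long blockId, long genStamp) {
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    private long globalGenStamp;
    private final Queue<Ibr> pending = new ArrayDeque<>();

    PendingIbrModel(long globalGenStamp) {
        this.globalGenStamp = globalGenStamp;
    }

    void setGlobalGenStamp(long genStamp) {
        this.globalGenStamp = genStamp;
    }

    void enqueue(Ibr ibr) {
        pending.add(ibr);
    }

    int pendingSize() {
        return pending.size();
    }

    /**
     * One pass over the queue. Reports that look "from the future"
     * (genStamp greater than the global stamp) are put back; the rest
     * count as processed. Returns the number of processed reports.
     */
    int processPending() {
        int processed = 0;
        int size = pending.size();
        for (int i = 0; i < size; i++) {
            Ibr ibr = pending.remove();
            if (ibr.genStamp > globalGenStamp) {
                pending.add(ibr); // re-queued: this is the leak if the stamp never advances
            } else {
                processed++;
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        PendingIbrModel model = new PendingIbrModel(1000L);
        model.enqueue(new Ibr(1L, 1001L));
        System.out.println("processed=" + model.processPending()
            + " pending=" + model.pendingSize());
        model.setGlobalGenStamp(1001L);
        System.out.println("processed=" + model.processPending()
            + " pending=" + model.pendingSize());
    }
}
```

In this model the queue only drains once `setGlobalGenStamp` is called with a large enough value, which mirrors the issue's point that a delayed global gen stamp update leaves the last IBRs stuck.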
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143502#comment-17143502 ]

Akira Ajisaka commented on HDFS-15421:
--------------------------------------

Thanks [~shv] for your review and suggestion. Merged the test files.
[jira] [Updated] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka updated HDFS-15421:
---------------------------------
    Attachment: HDFS-15421.003.patch
[jira] [Updated] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jianghua zhu updated HDFS-13248:
--------------------------------
    Description:
When a put operation is executed via the Router, the NameNode chooses the block location for the Router, not for the real client. This affects the file's locality.
I think that on both the NameNode and the Router we should add a new addBlock method, or add a parameter to the current addBlock method, to pass the real client information.

  was: NegativeArraySizeException when PROVIDED replication >1

> RBF: Namenode need to choose block location for the client
> ----------------------------------------------------------
>
>          Key: HDFS-13248
>          URL: https://issues.apache.org/jira/browse/HDFS-13248
>      Project: Hadoop HDFS
>   Issue Type: Sub-task
>     Reporter: Wu Weiwei
>     Assignee: Íñigo Goiri
>     Priority: Major
>  Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg
>
>
> When a put operation is executed via the Router, the NameNode chooses the block
> location for the Router, not for the real client. This affects the file's
> locality.
> I think that on both the NameNode and the Router we should add a new addBlock
> method, or add a parameter to the current addBlock method, to pass the real
> client information.
[jira] [Updated] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jianghua zhu updated HDFS-13248:
--------------------------------
    Description: NegativeArraySizeException when PROVIDED replication >1  (was: When execute a put operation via router, the NameNode will choose block location for the router, not for the real client. This will affect the file's locality. I think on both NameNode and Router, we should add a new addBlock method, or add a parameter for the current addBlock method, to pass the real client information.)
[jira] [Commented] (HDFS-15416) DataStorage#addStorageLocations() should add more reasonable information verification.
[ https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143483#comment-17143483 ]

jianghua zhu commented on HDFS-15416:
-------------------------------------

[~elgoiri], thank you very much for your suggestions.
I have submitted a new patch file, in which I have modified some of the source code.

> DataStorage#addStorageLocations() should add more reasonable information verification.
> ---------------------------------------------------------------------------------------
>
>             Key: HDFS-15416
>             URL: https://issues.apache.org/jira/browse/HDFS-15416
>         Project: Hadoop HDFS
>      Issue Type: Improvement
> Affects Versions: 3.1.0, 3.1.1
>        Reporter: jianghua zhu
>        Assignee: jianghua zhu
>        Priority: Major
>     Attachments: HDFS-15416.000.patch, HDFS-15416.001.patch
>
>
> successLocations is a list; when it is empty, loadBlockPoolSliceStorage()
> does not need to be executed.
> code:
>     try {
>       final List<StorageLocation> successLocations = loadDataStorage(
>           datanode, nsInfo, dataDirs, startOpt, executor);
>       return loadBlockPoolSliceStorage(
>           datanode, nsInfo, successLocations, startOpt, executor);
>     } finally {
>       executor.shutdown();
>     }
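The check proposed in the issue amounts to returning early when the first loading stage yields nothing. A minimal sketch of that idea follows, assuming a generic two-stage load; `StorageLoadSketch`, its method, and the `secondStage` parameter are hypothetical stand-ins and do not reproduce the signatures of `DataStorage`. Whether the real method needs such a guard is exactly what the issue discusses.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of an early return between two loading stages. */
class StorageLoadSketch {

    /**
     * Runs the second stage only when the first stage produced at least
     * one location. "secondStage" stands in for loadBlockPoolSliceStorage().
     */
    static <L> List<L> addStorageLocations(
            List<L> successLocations, UnaryOperator<List<L>> secondStage) {
        if (successLocations.isEmpty()) {
            // Nothing was loaded, so there is nothing for the second stage to do.
            return Collections.emptyList();
        }
        return secondStage.apply(successLocations);
    }
}
```

Passing the second stage in as a function keeps the sketch self-contained and makes it easy to confirm that the stage is skipped for an empty input.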
[jira] [Updated] (HDFS-15416) DataStorage#addStorageLocations() should add more reasonable information verification.
[ https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jianghua zhu updated HDFS-15416:
--------------------------------
    Attachment: HDFS-15416.001.patch
        Status: Patch Available  (was: In Progress)
[jira] [Commented] (HDFS-15425) Review Logging of DFSClient
[ https://issues.apache.org/jira/browse/HDFS-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143456#comment-17143456 ]

Hongbing Wang commented on HDFS-15425:
--------------------------------------

[~elgoiri] Thanks for your review!

> Review Logging of DFSClient
> ---------------------------
>
>          Key: HDFS-15425
>          URL: https://issues.apache.org/jira/browse/HDFS-15425
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Hongbing Wang
>     Assignee: Hongbing Wang
>     Priority: Minor
>  Attachments: HDFS-15425.001.patch, HDFS-15425.002.patch
>
>
> Review use of SLF4J for DFSClient.LOG.
> Make the code more concise and readable.
> Less is more!
[jira] [Comment Edited] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143435#comment-17143435 ]

liusheng edited comment on HDFS-15098 at 6/24/20, 1:30 AM:
-----------------------------------------------------------

Hi [~weichiu], I am sorry for the delay on this feature. We have now updated the patches and tested them successfully locally; the patch adds test cases, config options, and docs. Currently, SM4 is supported in openssl>=1.1.1; if this requirement is not satisfied, Hadoop falls back to the SM4 implementation of BouncyCastleProvider, which is already a dependency of Hadoop. So now we only need to configure the KMS services to enable SM4 support. Could you please help to review again?

was (Author: seanlau):
Hi [~weichiu], I am sorry for the delay on this feature. We have now updated the patches and tested them successfully locally; the patch adds test cases, config options, and docs. Currently, SM4 is supported in openssl>=1.1.1; if this requirement is not satisfied, Hadoop falls back to the SM4 implementation BouncyCastleProvider, which is already a dependency of Hadoop. So now we only need to configure the KMS services to enable SM4 support. Could you please help to review again?

> Add SM4 encryption method for HDFS
> ----------------------------------
>
>             Key: HDFS-15098
>             URL: https://issues.apache.org/jira/browse/HDFS-15098
>         Project: Hadoop HDFS
>      Issue Type: New Feature
> Affects Versions: 3.4.0
>        Reporter: liusheng
>        Assignee: zZtai
>        Priority: Major
>          Labels: sm4
>     Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, HDFS-15098.006.patch, HDFS-15098.007.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard
> for Wireless LAN WAPI (WLAN Authentication and Privacy Infrastructure).
> SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far
> been rejected by ISO. One of the reasons for the rejection has been
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>
> *Use SM4 on HDFS as follows:*
> 1. Download the Bouncy Castle Crypto APIs from bouncycastle.org
> [https://bouncycastle.org/download/bcprov-ext-jdk15on-165.jar]
> 2. Configure the JDK
> Place bcprov-ext-jdk15on-165.jar in the $JAVA_HOME/jre/lib/ext directory and
> add "security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider"
> to the $JAVA_HOME/jre/lib/security/java.security file
> 3. Configure Hadoop KMS
> 4. Test HDFS SM4
> hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
> hdfs dfs -mkdir /benchmarks
> hdfs crypto -createZone -keyName key1 -path /benchmarks
> *Requires:*
> 1. openssl version >=1.1.1
> 2. Bouncy Castle Crypto configured on the JDK
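After step 2 of the setup above, it can be useful to confirm that the JDK actually resolves the SM4 transformation. The following is a minimal sketch that probes the standard Java Cryptography Architecture only; the class name `CipherProbe` is hypothetical, and the probe says nothing about Hadoop's OpenSSL-based code path or about the KMS configuration. On a JDK without a provider that supplies SM4 (such as Bouncy Castle), the probe is expected to report that the transformation is unavailable.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;

/** Hypothetical helper that checks whether a JCE transformation can be resolved. */
class CipherProbe {

    /** Returns true if some installed security provider supplies the transformation. */
    static boolean isAvailable(String transformation) {
        try {
            Cipher.getInstance(transformation);
            return true;
        } catch (GeneralSecurityException e) {
            // NoSuchAlgorithmException or NoSuchPaddingException: not provided.
            return false;
        }
    }

    public static void main(String[] args) {
        // Same transformation string as used with "hadoop key create" in the issue.
        System.out.println("SM4/CTR/NoPadding available: "
            + isAvailable("SM4/CTR/NoPadding"));
    }
}
```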
[jira] [Commented] (HDFS-15383) RBF: Disable watch in ZKDelegationSecretManager for performance
[ https://issues.apache.org/jira/browse/HDFS-15383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143430#comment-17143430 ]

Fengnan Li commented on HDFS-15383:
-----------------------------------

Thanks! [~elgoiri] [~hexiaoqiao]

> RBF: Disable watch in ZKDelegationSecretManager for performance
> ---------------------------------------------------------------
>
>          Key: HDFS-15383
>          URL: https://issues.apache.org/jira/browse/HDFS-15383
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Fengnan Li
>     Assignee: Fengnan Li
>     Priority: Major
>      Fix For: 3.4.0
>
>
> Based on the current design for delegation tokens in the secure Router, the
> total number of watches for tokens is the product of the number of Routers and
> the number of tokens. This is because ZKDelegationTokenManager uses
> PathChildrenCache from Curator, which automatically sets the watch, and ZK
> pushes the sync information to each Router. Some evaluations show that a large
> number of watches in ZooKeeper has a negative performance impact on the
> ZooKeeper server.
> In our practice, when the number of watches exceeds 1.2 million in a single ZK
> server, there is significant ZK performance degradation. Thus this ticket
> is to rewrite ZKDelegationTokenManagerImpl.java to explicitly disable the
> PathChildrenCache and have Routers sync periodically from ZooKeeper. This has
> been working fine at the scale of 10 Routers with 2 million tokens.
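The scaling argument in the description is simple arithmetic, and a short sketch makes the numbers concrete. The class `WatchEstimate` and its methods are hypothetical; the 1.2 million threshold and the 10 Routers with 2 million tokens are the figures quoted in the issue, while the even spread of watches across ZooKeeper servers and the server count of 5 are assumptions made purely for illustration.

```java
/** Hypothetical back-of-the-envelope estimate of ZooKeeper watch counts. */
class WatchEstimate {

    /** Per-server watch count at which the issue reports significant degradation. */
    static final long DEGRADATION_THRESHOLD = 1_200_000L;

    /** With one watch per token per Router, the total is the product of the two. */
    static long totalWatches(long routers, long tokens) {
        return Math.multiplyExact(routers, tokens);
    }

    /** Assumes watches are spread evenly over the ZooKeeper servers (rounded up). */
    static long watchesPerServer(long routers, long tokens, long zkServers) {
        long total = totalWatches(routers, tokens);
        return (total + zkServers - 1) / zkServers;
    }

    public static void main(String[] args) {
        long perServer = watchesPerServer(10, 2_000_000L, 5); // 5 servers is an assumption
        System.out.println("total watches: " + totalWatches(10, 2_000_000L));
        System.out.println("per server:    " + perServer);
        System.out.println("over threshold: " + (perServer > DEGRADATION_THRESHOLD));
    }
}
```

Under these assumptions, watch-based syncing at the quoted scale would put each server well above the reported threshold, which is consistent with the motivation for periodic syncing given in the ticket.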
[jira] [Updated] (HDFS-15429) mkdirs should work when parent dir is internalDir and fallback configured.
[ https://issues.apache.org/jira/browse/HDFS-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-15429:
---------------------------------------
    Status: Patch Available  (was: Open)

Updated a PR for review!

> mkdirs should work when parent dir is internalDir and fallback configured.
> --------------------------------------------------------------------------
>
>             Key: HDFS-15429
>             URL: https://issues.apache.org/jira/browse/HDFS-15429
>         Project: Hadoop HDFS
>      Issue Type: Sub-task
> Affects Versions: 3.21
>        Reporter: Uma Maheswara Rao G
>        Assignee: Uma Maheswara Rao G
>        Priority: Major
>
>
> mkdir does not work if the parent dir is an internal mount dir (a non-leaf in
> the mount path) and a fallback is configured.
> Since a fallback is available, if the same tree structure is available in the
> fallback, we should be able to mkdir in the fallback.
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143331#comment-17143331 ]

Kihwal Lee commented on HDFS-15421:
-----------------------------------

Thanks, [~aajisaka] for the patch. I will also have a look soon.
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143296#comment-17143296 ]

Konstantin Shvachko commented on HDFS-15421:
--------------------------------------------

Good catch [~kihwal], thanks for debugging this. [~aajisaka], thanks for the patch.
Clearly HDFS-14941 missed some append and truncate cases, which update blocks with a new genStamp while tailing.
I took a look at the v02 patch. It seems you correctly caught all the other cases of block updates during tailing. It would be good if [~vagarychen] could take a look as well.
One suggestion for the tests is to move all test cases into {{TestAddBlockTailing}} if possible, potentially renaming it to something like {{TestUpdateBlockTailing}}. The two new tests have a lot of code in common with {{TestAddBlockTailing}}, and merging them would avoid extra MiniCluster startups, making the tests run faster.
[jira] [Commented] (HDFS-15383) RBF: Disable watch in ZKDelegationSecretManager for performance
[ https://issues.apache.org/jira/browse/HDFS-15383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143278#comment-17143278 ]

Hudson commented on HDFS-15383:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18377 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18377/])
HDFS-15383. RBF: Add support for router delegation token without watch (github: rev 84110d850e2bc2a9ff4afcc7508fecd81cb5b7e5)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/security/token/ZKDelegationTokenSecretManagerImpl.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
* (add) hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/security/token/TestZKDelegationTokenSecretManagerImpl.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/token/delegation/TestZKDelegationTokenSecretManager.java
[jira] [Commented] (HDFS-15416) DataStorage#addStorageLocations() should add more reasonable information verification.
[ https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143276#comment-17143276 ]

Íñigo Goiri commented on HDFS-15416:
------------------------------------

Thanks [~jianghuazhu] for the update.
For tracking, please keep adding patches with a sequence number.
For the test, the content shouldn't have a javadoc-style comment but a regular one.
[jira] [Commented] (HDFS-15383) RBF: Disable watch in ZKDelegationSecretManager for performance
[ https://issues.apache.org/jira/browse/HDFS-15383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143268#comment-17143268 ]

Íñigo Goiri commented on HDFS-15383:
------------------------------------

Thanks [~fengnanli] for the patch and [~hexiaoqiao] for the review.
Merged the PR.
[jira] [Resolved] (HDFS-15383) RBF: Disable watch in ZKDelegationSecretManager for performance
[ https://issues.apache.org/jira/browse/HDFS-15383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Íñigo Goiri resolved HDFS-15383.
--------------------------------
    Fix Version/s: 3.4.0
     Hadoop Flags: Reviewed
       Resolution: Fixed
[jira] [Commented] (HDFS-15425) Review Logging of DFSClient
[ https://issues.apache.org/jira/browse/HDFS-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143165#comment-17143165 ]

Íñigo Goiri commented on HDFS-15425:
------------------------------------

[^HDFS-15425.002.patch] looks safer.
We probably should fix the checkstyle though.
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143102#comment-17143102 ]

Hadoop QA commented on HDFS-15421:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 3m 0s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 1s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 59s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 24s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 28s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}187m 46s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.TestReconstructStripedFile |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29455/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15421 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006271/HDFS-15421.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 68ab2dd622fb 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 03f855e3e7a |
| Default Java | Private
[jira] [Resolved] (HDFS-13510) Ozone: Fix precommit hook for Ozone/Hdds on trunk
[ https://issues.apache.org/jira/browse/HDFS-13510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek resolved HDFS-13510. Resolution: Won't Fix > Ozone: Fix precommit hook for Ozone/Hdds on trunk > - > > Key: HDFS-13510 > URL: https://issues.apache.org/jira/browse/HDFS-13510 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > > Current precommit doesn't work with the ozone projects as they are in an > optional profile. > This jira may not have any code change but I opened it to track the required > changes on builds.apache.org and make the changes more transparent. > I think we need the following changes: > 1. Separated jira subproject, as planned > 2. After that we can create new Precommit-OZONE-Build job which will be > triggered by the PreCommit-Admin (jira filter should be modified) > 3. In the Precommit-OZONE-Build we need to enable the hdds profile. It could > be done by modifying the yetus personality or the create a .mvn/mvn.config > 4. We need the ozone/hdds snapshot artifacts in apache nexus: > a.) One option is adding -P hdds to the Hadoop-trunk-Commit. This is the > simplified but Hdds/Ozone build failure will cause missing artifacts on nexus > (low chance as the merge will be guarded by PreCommit hook) > b.) Other options is to create a Hadoop-Ozone-trunk-Commit which do a full > compilation but only hdds and ozone artifacts will be deployed (some sync > problem maybe here if different core artifacts are uploaded...) > 5. And we also need a daily unit test run. (qbt) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142930#comment-17142930 ] Akira Ajisaka commented on HDFS-15421: -- 002 patch * fixed comments in test > IBR leak causes standby NN to be stuck in safe mode > --- > > Key: HDFS-15421 > URL: https://issues.apache.org/jira/browse/HDFS-15421 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Akira Ajisaka >Priority: Blocker > Labels: release-blocker > Attachments: HDFS-15421-000.patch, HDFS-15421-001.patch, > HDFS-15421.002.patch > > > After HDFS-14941, update of the global gen stamp is delayed in certain > situations. This makes the last set of incremental block reports from append > "from future", which causes it to be simply re-queued to the pending DN > message queue, rather than processed to complete the block. The last set of > IBRs will leak and never cleaned until it transitions to active. The size of > {{pendingDNMessages}} constantly grows until then. > If a leak happens while in a startup safe mode, the namenode will never be > able to come out of safe mode on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15421: - Attachment: HDFS-15421.002.patch > IBR leak causes standby NN to be stuck in safe mode > --- > > Key: HDFS-15421 > URL: https://issues.apache.org/jira/browse/HDFS-15421 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Akira Ajisaka >Priority: Blocker > Labels: release-blocker > Attachments: HDFS-15421-000.patch, HDFS-15421-001.patch, > HDFS-15421.002.patch > > > After HDFS-14941, update of the global gen stamp is delayed in certain > situations. This makes the last set of incremental block reports from append > "from future", which causes it to be simply re-queued to the pending DN > message queue, rather than processed to complete the block. The last set of > IBRs will leak and never cleaned until it transitions to active. The size of > {{pendingDNMessages}} constantly grows until then. > If a leak happens while in a startup safe mode, the namenode will never be > able to come out of safe mode on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142884#comment-17142884 ]

Akira Ajisaka edited comment on HDFS-15421 at 6/23/20, 1:33 PM:

{quote}I think we need to update genstamp when rolling {{OP_APPEND}}. In {{OP_TRUNCATE}}, it is the same.
{quote}
This change does not fix the problem for append. When appending to a block without {{CreateFlag.NEW_BLOCK}}, the edit log becomes as follows:
* {{OP_APPEND}}: prepare for append
* {{OP_SET_GENSTAMP_V2}}: update pipeline
* (edited) {{OP_UPDATE_BLOCKS}}: update blocks

That way the SNN tails {{OP_SET_GENSTAMP_V2}} after {{OP_APPEND}}, so applying the impending genstamp in {{OP_APPEND}} does not fix this problem.

I'll attach a patch with some regression tests.

was (Author: ajisakaa):
{quote}I think we need to update genstamp when rolling {{OP_APPEND}}. In {{OP_TRUNCATE}}, it is the same.
{quote}
This change does not fix the problem for append. When appending to a block without {{CreateFlag.NEW_BLOCK}}, the edit log becomes as follows:
* {{OP_APPEND}}: prepare for append
* {{OP_SET_GENSTAMP_V2}}: update pipeline

That way the SNN tails {{OP_SET_GENSTAMP_V2}} after {{OP_APPEND}}, so applying the impending genstamp in {{OP_APPEND}} does not fix this problem.

I'll attach a patch with some regression tests.

> IBR leak causes standby NN to be stuck in safe mode
> ---
>
> Key: HDFS-15421
> URL: https://issues.apache.org/jira/browse/HDFS-15421
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Kihwal Lee
> Assignee: Akira Ajisaka
> Priority: Blocker
> Labels: release-blocker
> Attachments: HDFS-15421-000.patch, HDFS-15421-001.patch
>
>
> After HDFS-14941, update of the global gen stamp is delayed in certain
> situations. This makes the last set of incremental block reports from append
> "from future", which causes it to be simply re-queued to the pending DN
> message queue, rather than processed to complete the block. The last set of
> IBRs will leak and never cleaned until it transitions to active. The size of
> {{pendingDNMessages}} constantly grows until then.
> If a leak happens while in a startup safe mode, the namenode will never be
> able to come out of safe mode on its own.
[jira] [Updated] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15421: - Assignee: Akira Ajisaka Status: Patch Available (was: Open) > IBR leak causes standby NN to be stuck in safe mode > --- > > Key: HDFS-15421 > URL: https://issues.apache.org/jira/browse/HDFS-15421 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Akira Ajisaka >Priority: Blocker > Labels: release-blocker > Attachments: HDFS-15421-000.patch, HDFS-15421-001.patch > > > After HDFS-14941, update of the global gen stamp is delayed in certain > situations. This makes the last set of incremental block reports from append > "from future", which causes it to be simply re-queued to the pending DN > message queue, rather than processed to complete the block. The last set of > IBRs will leak and never cleaned until it transitions to active. The size of > {{pendingDNMessages}} constantly grows until then. > If a leak happens while in a startup safe mode, the namenode will never be > able to come out of safe mode on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142925#comment-17142925 ] Akira Ajisaka commented on HDFS-15421: -- Attached a 001 patch to update the global genstamp in SBN when tailing {{OP_TRUNCATE}} and {{OP_UPDATE_BLOCKS}} edit logs. Please ignore my previous comments. Sorry for going back and forth. > IBR leak causes standby NN to be stuck in safe mode > --- > > Key: HDFS-15421 > URL: https://issues.apache.org/jira/browse/HDFS-15421 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Priority: Blocker > Labels: release-blocker > Attachments: HDFS-15421-000.patch, HDFS-15421-001.patch > > > After HDFS-14941, update of the global gen stamp is delayed in certain > situations. This makes the last set of incremental block reports from append > "from future", which causes it to be simply re-queued to the pending DN > message queue, rather than processed to complete the block. The last set of > IBRs will leak and never cleaned until it transitions to active. The size of > {{pendingDNMessages}} constantly grows until then. > If a leak happens while in a startup safe mode, the namenode will never be > able to come out of safe mode on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
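The genstamp handling discussed in this thread can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hadoop code: it assumes the standby keeps the value from {{OP_SET_GENSTAMP_V2}} as an "impending" genstamp and publishes it to the global genstamp only when certain ops are tailed. The class `GenStampTailerSketch`, its methods, and the three-op append sequence are hypothetical names made up for illustration; only the op names come from the comments above.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;

/** Toy model of a standby NameNode tailing edits. All names are hypothetical. */
class GenStampTailerSketch {
    enum Op { OP_APPEND, OP_SET_GENSTAMP_V2, OP_UPDATE_BLOCKS, OP_TRUNCATE }

    /** One edit-log entry; genStamp is only meaningful for OP_SET_GENSTAMP_V2. */
    static final class Edit {
        final Op op;
        final long genStamp;
        Edit(Op op, long genStamp) {
            this.op = op;
            this.genStamp = genStamp;
        }
    }

    private long globalGenStamp;
    private long impendingGenStamp;
    private final EnumSet<Op> applyOn; // ops that publish the impending genstamp

    GenStampTailerSketch(long initialGenStamp, EnumSet<Op> applyOn) {
        this.globalGenStamp = initialGenStamp;
        this.impendingGenStamp = initialGenStamp;
        this.applyOn = EnumSet.copyOf(applyOn);
    }

    /** Replays edits in order, as a tailing standby would. */
    void tail(List<Edit> edits) {
        for (Edit e : edits) {
            if (e.op == Op.OP_SET_GENSTAMP_V2) {
                // Assumption: the new genstamp is remembered but not yet global.
                impendingGenStamp = Math.max(impendingGenStamp, e.genStamp);
            }
            if (applyOn.contains(e.op)) {
                globalGenStamp = Math.max(globalGenStamp, impendingGenStamp);
            }
        }
    }

    /** An IBR whose genstamp exceeds the global one would be postponed. */
    boolean isFromFuture(long reportedGenStamp) {
        return reportedGenStamp > globalGenStamp;
    }

    /** Edit sequence for an append without NEW_BLOCK, as listed in the thread. */
    static List<Edit> appendSequence(long newGenStamp) {
        return Arrays.asList(
            new Edit(Op.OP_APPEND, 0L),
            new Edit(Op.OP_SET_GENSTAMP_V2, newGenStamp),
            new Edit(Op.OP_UPDATE_BLOCKS, 0L));
    }

    public static void main(String[] args) {
        GenStampTailerSketch onAppend =
            new GenStampTailerSketch(1000L, EnumSet.of(Op.OP_APPEND));
        onAppend.tail(appendSequence(1001L));
        System.out.println("apply on OP_APPEND: from future = "
            + onAppend.isFromFuture(1001L));

        GenStampTailerSketch onUpdate = new GenStampTailerSketch(
            1000L, EnumSet.of(Op.OP_UPDATE_BLOCKS, Op.OP_TRUNCATE));
        onUpdate.tail(appendSequence(1001L));
        System.out.println("apply on OP_UPDATE_BLOCKS/OP_TRUNCATE: from future = "
            + onUpdate.isFromFuture(1001L));
    }
}
```

In this model, publishing the impending genstamp at {{OP_APPEND}} is too early because {{OP_SET_GENSTAMP_V2}} has not been tailed yet, so a report carrying the new genstamp still looks like it is from the future. Publishing it at {{OP_UPDATE_BLOCKS}} happens after the new value is known, so the same report can be processed.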
[jira] [Updated] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15421: - Attachment: HDFS-15421-001.patch > IBR leak causes standby NN to be stuck in safe mode > --- > > Key: HDFS-15421 > URL: https://issues.apache.org/jira/browse/HDFS-15421 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Priority: Blocker > Labels: release-blocker > Attachments: HDFS-15421-000.patch, HDFS-15421-001.patch > > > After HDFS-14941, update of the global gen stamp is delayed in certain > situations. This makes the last set of incremental block reports from append > "from future", which causes it to be simply re-queued to the pending DN > message queue, rather than processed to complete the block. The last set of > IBRs will leak and never cleaned until it transitions to active. The size of > {{pendingDNMessages}} constantly grows until then. > If a leak happens while in a startup safe mode, the namenode will never be > able to come out of safe mode on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142892#comment-17142892 ]

Akira Ajisaka commented on HDFS-15421:
--

I think HDFS-14941 can be reverted because it causes an IBR leak not only in append but also in pipeline recovery. Now I'm +1 for option 3 to fix the edit log race in https://issues.apache.org/jira/browse/HDFS-14941?focusedCommentId=16963371&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16963371

Any thoughts?

> IBR leak causes standby NN to be stuck in safe mode
> ---
>
> Key: HDFS-15421
> URL: https://issues.apache.org/jira/browse/HDFS-15421
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Kihwal Lee
> Priority: Blocker
> Labels: release-blocker
> Attachments: HDFS-15421-000.patch
>
>
> After HDFS-14941, update of the global gen stamp is delayed in certain
> situations. This makes the last set of incremental block reports from append
> "from future", which causes it to be simply re-queued to the pending DN
> message queue, rather than processed to complete the block. The last set of
> IBRs will leak and never cleaned until it transitions to active. The size of
> {{pendingDNMessages}} constantly grows until then.
> If a leak happens while in a startup safe mode, the namenode will never be
> able to come out of safe mode on its own.
[jira] [Updated] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15421: - Attachment: HDFS-15421-000.patch > IBR leak causes standby NN to be stuck in safe mode > --- > > Key: HDFS-15421 > URL: https://issues.apache.org/jira/browse/HDFS-15421 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Priority: Blocker > Labels: release-blocker > Attachments: HDFS-15421-000.patch > > > After HDFS-14941, update of the global gen stamp is delayed in certain > situations. This makes the last set of incremental block reports from append > "from future", which causes it to be simply re-queued to the pending DN > message queue, rather than processed to complete the block. The last set of > IBRs will leak and never cleaned until it transitions to active. The size of > {{pendingDNMessages}} constantly grows until then. > If a leak happens while in a startup safe mode, the namenode will never be > able to come out of safe mode on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142884#comment-17142884 ]

Akira Ajisaka commented on HDFS-15421:
--

{quote}I think we need to update genstamp when rolling {{OP_APPEND}}. In {{OP_TRUNCATE}}, it is the same.
{quote}
This change does not fix the problem for append. When appending to a block without {{CreateFlag.NEW_BLOCK}}, the edit log becomes as follows:
* {{OP_APPEND}}: prepare for append
* {{OP_SET_GENSTAMP_V2}}: update pipeline

That way the SNN tails {{OP_SET_GENSTAMP_V2}} after {{OP_APPEND}}, so applying the impending genstamp in {{OP_APPEND}} does not fix this problem.

I'll attach a patch with some regression tests.

> IBR leak causes standby NN to be stuck in safe mode
> ---
>
> Key: HDFS-15421
> URL: https://issues.apache.org/jira/browse/HDFS-15421
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Kihwal Lee
> Priority: Blocker
> Labels: release-blocker
> Attachments: HDFS-15421-000.patch
>
>
> After HDFS-14941, update of the global gen stamp is delayed in certain
> situations. This makes the last set of incremental block reports from append
> "from future", which causes it to be simply re-queued to the pending DN
> message queue, rather than processed to complete the block. The last set of
> IBRs will leak and never cleaned until it transitions to active. The size of
> {{pendingDNMessages}} constantly grows until then.
> If a leak happens while in a startup safe mode, the namenode will never be
> able to come out of safe mode on its own.
[jira] [Commented] (HDFS-15425) Review Logging of DFSClient
[ https://issues.apache.org/jira/browse/HDFS-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142850#comment-17142850 ]

Hadoop QA commented on HDFS-15425:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 30m 53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 12s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 2m 51s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 22s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 1 new + 61 unchanged - 0 fixed = 62 total (was 61) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 32s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 19s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}105m 59s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29454/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15425 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006247/HDFS-15425.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux f542ec76cf44 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / fa14e4bc001 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| checkstyle |
[jira] [Commented] (HDFS-15425) Review Logging of DFSClient
[ https://issues.apache.org/jira/browse/HDFS-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142792#comment-17142792 ]

Hongbing Wang commented on HDFS-15425:
--

I have provided an optional version, 002.patch. [~elgoiri] Could you help review it? Thanks!

> Review Logging of DFSClient
> ---
>
> Key: HDFS-15425
> URL: https://issues.apache.org/jira/browse/HDFS-15425
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Hongbing Wang
> Assignee: Hongbing Wang
> Priority: Minor
> Attachments: HDFS-15425.001.patch, HDFS-15425.002.patch
>
>
> Review use of SLF4J for DFSClient.LOG.
> Make the code more concise and readable.
> Less is more !
[jira] [Updated] (HDFS-15425) Review Logging of DFSClient
[ https://issues.apache.org/jira/browse/HDFS-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongbing Wang updated HDFS-15425: - Attachment: HDFS-15425.002.patch > Review Logging of DFSClient > --- > > Key: HDFS-15425 > URL: https://issues.apache.org/jira/browse/HDFS-15425 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Hongbing Wang >Assignee: Hongbing Wang >Priority: Minor > Attachments: HDFS-15425.001.patch, HDFS-15425.002.patch > > > Review use of SLF4J for DFSClient.LOG. > Make the code more concise and readable. > Less is more ! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15431) Can not read a opening file after NameNode failover if pipeline recover occured
[ https://issues.apache.org/jira/browse/HDFS-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142772#comment-17142772 ] ludun edited comment on HDFS-15431 at 6/23/20, 9:11 AM: [~surendrasingh], [~hemanthboyina] please check this issue. was (Author: pilchard): [~surendrasingh], [~hemanthboyina] please check the issue. > Can not read a opening file after NameNode failover if pipeline recover > occured > --- > > Key: HDFS-15431 > URL: https://issues.apache.org/jira/browse/HDFS-15431 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: ludun >Priority: Major > > a file with two replications and keep it opening. > first it writes to DN1 and DN2. > {code} > 2020-06-23 14:22:51,379 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-7e434b35-0b10-44fa-9d3b-c3c938f1724d,DISK]] > | DataStreamer.java:1757 > {code} > after DN2 restart, it writes to DN1 and DN3, > {code} > 2020-06-23 14:24:04,559 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK]] > | DataStreamer.java:1757 > {code} > after DN1 restart. it writes to DN3 and DN4. > {code} > 2020-06-23 14:25:21,340 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK], > > DatanodeInfoWithStorage[DN4:25009,DS-5fbb2232-e7c8-4186-8eb9-87a6aff86cef,DISK]] > | DataStreamer.java:1757 > {code} > restart Active NameNode. then try to get the file. > NameNode return locatedblocks with DN1 and DN2. Can not obtain block > Exception occurred. 
> {code} > 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ > fileLength=0 > underConstruction=true > > blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]}] > > lastLocatedBlock=LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]} > isLastBlockComplete=false} > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15431) Can not read a opening file after NameNode failover if pipeline recover occured
[ https://issues.apache.org/jira/browse/HDFS-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ludun updated HDFS-15431: - Summary: Can not read a opening file after NameNode failover if pipeline recover occured (was: Can not read a opening file after NameNode failover if pipeline recover occuered) > Can not read a opening file after NameNode failover if pipeline recover > occured > --- > > Key: HDFS-15431 > URL: https://issues.apache.org/jira/browse/HDFS-15431 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: ludun >Priority: Major > > a file with two replications and keep it opening. > first it writes to DN1 and DN2. > {code} > 2020-06-23 14:22:51,379 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-7e434b35-0b10-44fa-9d3b-c3c938f1724d,DISK]] > | DataStreamer.java:1757 > {code} > after DN2 restart, it writes to DN1 and DN3, > {code} > 2020-06-23 14:24:04,559 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK]] > | DataStreamer.java:1757 > {code} > after DN1 restart. it writes to DN3 and DN4. > {code} > 2020-06-23 14:25:21,340 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK], > > DatanodeInfoWithStorage[DN4:25009,DS-5fbb2232-e7c8-4186-8eb9-87a6aff86cef,DISK]] > | DataStreamer.java:1757 > {code} > restart Active NameNode. then try to get the file. > NameNode return locatedblocks with DN1 and DN2. Can not obtain block > Exception occurred. 
> {code} > 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ > fileLength=0 > underConstruction=true > > blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]}] > > lastLocatedBlock=LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]} > isLastBlockComplete=false} > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15431) Can not read a opening file after NameNode failover if pipeline recover occuered
[ https://issues.apache.org/jira/browse/HDFS-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142772#comment-17142772 ] ludun commented on HDFS-15431: -- [~surendrasingh] [~hemanthboyina] please check the issue. > Can not read a opening file after NameNode failover if pipeline recover > occuered > > > Key: HDFS-15431 > URL: https://issues.apache.org/jira/browse/HDFS-15431 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: ludun >Priority: Major > > a file with two replications and keep it opening. > first it writes to DN1 and DN2. > {code} > 2020-06-23 14:22:51,379 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-7e434b35-0b10-44fa-9d3b-c3c938f1724d,DISK]] > | DataStreamer.java:1757 > {code} > after DN2 restart, it writes to DN1 and DN3, > {code} > 2020-06-23 14:24:04,559 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK]] > | DataStreamer.java:1757 > {code} > after DN1 restart. it writes to DN3 and DN4. > {code} > 2020-06-23 14:25:21,340 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK], > > DatanodeInfoWithStorage[DN4:25009,DS-5fbb2232-e7c8-4186-8eb9-87a6aff86cef,DISK]] > | DataStreamer.java:1757 > {code} > restart Active NameNode. then try to get the file. > NameNode return locatedblocks with DN1 and DN2. Can not obtain block > Exception occurred. 
> {code} > 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ > fileLength=0 > underConstruction=true > > blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]}] > > lastLocatedBlock=LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]} > isLastBlockComplete=false} > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15431) Can not read a opening file after NameNode failover if pipeline recover occuered
[ https://issues.apache.org/jira/browse/HDFS-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142772#comment-17142772 ] ludun edited comment on HDFS-15431 at 6/23/20, 9:09 AM: [~surendrasingh], [~hemanthboyina] please check the issue. was (Author: pilchard): [~surendrasingh] [~hemanthboyina] please check the issue. > Can not read a opening file after NameNode failover if pipeline recover > occuered > > > Key: HDFS-15431 > URL: https://issues.apache.org/jira/browse/HDFS-15431 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: ludun >Priority: Major > > a file with two replications and keep it opening. > first it writes to DN1 and DN2. > {code} > 2020-06-23 14:22:51,379 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-7e434b35-0b10-44fa-9d3b-c3c938f1724d,DISK]] > | DataStreamer.java:1757 > {code} > after DN2 restart, it writes to DN1 and DN3, > {code} > 2020-06-23 14:24:04,559 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK]] > | DataStreamer.java:1757 > {code} > after DN1 restart. it writes to DN3 and DN4. > {code} > 2020-06-23 14:25:21,340 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK], > > DatanodeInfoWithStorage[DN4:25009,DS-5fbb2232-e7c8-4186-8eb9-87a6aff86cef,DISK]] > | DataStreamer.java:1757 > {code} > restart Active NameNode. then try to get the file. > NameNode return locatedblocks with DN1 and DN2. Can not obtain block > Exception occurred. 
> {code} > 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ > fileLength=0 > underConstruction=true > > blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]}] > > lastLocatedBlock=LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]} > isLastBlockComplete=false} > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15431) Can not read a opening file after NameNode failover if pipeline recover occuered
[ https://issues.apache.org/jira/browse/HDFS-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142764#comment-17142764 ] ludun commented on HDFS-15431: -- In BlockReceiver, we should also call notifyNamenodeReceivingBlock in the PIPELINE_SETUP_STREAMING_RECOVERY case:
{code}
case PIPELINE_SETUP_CREATE:
  replicaHandler = datanode.data.createRbw(storageType, storageId,
      block, allowLazyPersist);
  datanode.notifyNamenodeReceivingBlock(
      block, replicaHandler.getReplica().getStorageUuid());
  break;
case PIPELINE_SETUP_STREAMING_RECOVERY:
  replicaHandler = datanode.data.recoverRbw(
      block, newGs, minBytesRcvd, maxBytesRcvd);
  block.setGenerationStamp(newGs);
  break;
{code}
That way the standby NameNode also learns the location of the new DN, and after failover we can read the file normally.
> Can not read a opening file after NameNode failover if pipeline recover > occuered > > > Key: HDFS-15431 > URL: https://issues.apache.org/jira/browse/HDFS-15431 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: ludun >Priority: Major > > a file with two replications and keep it opening. > first it writes to DN1 and DN2. > {code} > 2020-06-23 14:22:51,379 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-7e434b35-0b10-44fa-9d3b-c3c938f1724d,DISK]] > | DataStreamer.java:1757 > {code} > after DN2 restart, it writes to DN1 and DN3, > {code} > 2020-06-23 14:24:04,559 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK]] > | DataStreamer.java:1757 > {code} > after DN1 restart. it writes to DN3 and DN4. 
> {code} > 2020-06-23 14:25:21,340 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK], > > DatanodeInfoWithStorage[DN4:25009,DS-5fbb2232-e7c8-4186-8eb9-87a6aff86cef,DISK]] > | DataStreamer.java:1757 > {code} > restart Active NameNode. then try to get the file. > NameNode return locatedblocks with DN1 and DN2. Can not obtain block > Exception occurred. > {code} > 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ > fileLength=0 > underConstruction=true > > blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]}] > > lastLocatedBlock=LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]} > isLastBlockComplete=false} > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
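The suggestion in the comment above (also notifying the NameNode in the PIPELINE_SETUP_STREAMING_RECOVERY case) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hadoop code: `PipelineNotifySketch`, `NameNodeView`, `setupReceiver` and `replay` are hypothetical names, and the model tracks nothing but which DataNodes a standby NameNode has heard about for a single block. It replays the three pipelines from the issue description (DN1+DN2, then DN1+DN3, then DN3+DN4).

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model only; every class and method name here is hypothetical. */
class PipelineNotifySketch {
  enum Stage { PIPELINE_SETUP_CREATE, PIPELINE_SETUP_STREAMING_RECOVERY }

  /** Stand-in for a standby NameNode's view of one block's replica locations. */
  static class NameNodeView {
    final Set<String> locations = new LinkedHashSet<>();

    void receivingBlock(String datanode) {
      locations.add(datanode);
    }
  }

  /** Stand-in for the stage switch quoted in the comment. */
  static void setupReceiver(String datanode, Stage stage, NameNodeView nn,
      boolean notifyOnRecovery) {
    switch (stage) {
      case PIPELINE_SETUP_CREATE:
        nn.receivingBlock(datanode);
        break;
      case PIPELINE_SETUP_STREAMING_RECOVERY:
        if (notifyOnRecovery) {
          // The extra notification proposed in the comment.
          nn.receivingBlock(datanode);
        }
        break;
    }
  }

  /** Replays the pipelines from the report and returns what the standby knows. */
  static Set<String> replay(boolean notifyOnRecovery) {
    NameNodeView standby = new NameNodeView();
    // Initial pipeline: DN1, DN2.
    setupReceiver("DN1", Stage.PIPELINE_SETUP_CREATE, standby, notifyOnRecovery);
    setupReceiver("DN2", Stage.PIPELINE_SETUP_CREATE, standby, notifyOnRecovery);
    // DN2 restarts; the recovered pipeline is DN1, DN3.
    setupReceiver("DN1", Stage.PIPELINE_SETUP_STREAMING_RECOVERY, standby, notifyOnRecovery);
    setupReceiver("DN3", Stage.PIPELINE_SETUP_STREAMING_RECOVERY, standby, notifyOnRecovery);
    // DN1 restarts; the recovered pipeline is DN3, DN4.
    setupReceiver("DN3", Stage.PIPELINE_SETUP_STREAMING_RECOVERY, standby, notifyOnRecovery);
    setupReceiver("DN4", Stage.PIPELINE_SETUP_STREAMING_RECOVERY, standby, notifyOnRecovery);
    return standby.locations;
  }

  public static void main(String[] args) {
    System.out.println("without notification: " + replay(false));
    System.out.println("with notification:    " + replay(true));
  }
}
```

In this model the standby only ever hears about DN1 and DN2 unless the recovery stage also notifies it, which matches the stale locations in the LocatedBlocks output quoted in the report. The model does not cover how stale locations are removed.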
[jira] [Updated] (HDFS-15431) Can not read a opening file after NameNode failover if pipeline recover occuered
[ https://issues.apache.org/jira/browse/HDFS-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ludun updated HDFS-15431: - Description: a file with two replications and keep it opening. first it writes to DN1 and DN2. {code} 2020-06-23 14:22:51,379 | DEBUG | pipeline = [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], DatanodeInfoWithStorage[DN2:25009,DS-7e434b35-0b10-44fa-9d3b-c3c938f1724d,DISK]] | DataStreamer.java:1757 {code} after DN2 restart, it writes to DN1 and DN3, {code} 2020-06-23 14:24:04,559 | DEBUG | pipeline = [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK]] | DataStreamer.java:1757 {code} after DN1 restart. it writes to DN3 and DN4. {code} 2020-06-23 14:25:21,340 | DEBUG | pipeline = [DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK], DatanodeInfoWithStorage[DN4:25009,DS-5fbb2232-e7c8-4186-8eb9-87a6aff86cef,DISK]] | DataStreamer.java:1757 {code} restart Active NameNode. then try to get the file. NameNode return locatedblocks with DN1 and DN2. Can not obtain block Exception occurred. 
{code} 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ fileLength=0 underConstruction=true blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; getBlockSize()=53; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]}] lastLocatedBlock=LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; getBlockSize()=53; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]} isLastBlockComplete=false} {code} was: a file with two replications and keep it opening. first it writes to DN1 and DN2. after DN2 restart, it writes to DN1 and DN3, after DN1 restart. it writes to DN3 and DN4. restart Active NameNode. then try to get the file. NameNode return locatedblocks with DN1 and DN2. Can not obtain block Exception occurred. 
{code} 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ fileLength=0 underConstruction=true blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; getBlockSize()=53; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]}] lastLocatedBlock=LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; getBlockSize()=53; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]} isLastBlockComplete=false} {code} > Can not read a opening file after NameNode failover if pipeline recover > occuered > > > Key: HDFS-15431 > URL: https://issues.apache.org/jira/browse/HDFS-15431 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: ludun >Priority: Major > > a file with two replications and keep it opening. > first it writes to DN1 and DN2. > {code} > 2020-06-23 14:22:51,379 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN2:25009,DS-7e434b35-0b10-44fa-9d3b-c3c938f1724d,DISK]] > | DataStreamer.java:1757 > {code} > after DN2 restart, it writes to DN1 and DN3, > {code} > 2020-06-23 14:24:04,559 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], > > DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK]] > | DataStreamer.java:1757 > {code} > after DN1 restart. it writes to DN3 and DN4. 
> {code} > 2020-06-23 14:25:21,340 | DEBUG | pipeline = > [DatanodeInfoWithStorage[DN3:25009,DS-1810c3d5-b6e8-4403-a0fc-071ea6e5489f,DISK], > > DatanodeInfoWithStorage[DN4:25009,DS-5fbb2232-e7c8-4186-8eb9-87a6aff86cef,DISK]] > | DataStreamer.java:1757 > {code} > restart Active NameNode. then try to get the file. > NameNode return locatedblocks with DN1 and DN2. Can not obtain block > Exception occurred. > {code} > 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ > fileLength=0 > underConstruction=true > > blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; > getBlockSize()=53; corrupt=false; offset=0; >
[jira] [Created] (HDFS-15431) Can not read a opening file after NameNode failover if pipeline recover occuered
ludun created HDFS-15431: Summary: Can not read a opening file after NameNode failover if pipeline recover occuered Key: HDFS-15431 URL: https://issues.apache.org/jira/browse/HDFS-15431 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: ludun a file with two replications and keep it opening. first it writes to DN1 and DN2. after DN2 restart, it writes to DN1 and DN3, after DN1 restart. it writes to DN3 and DN4. restart Active NameNode. then try to get the file. NameNode return locatedblocks with DN1 and DN2. Can not obtain block Exception occurred. {code} 20/06/20 17:57:06 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ fileLength=0 underConstruction=true blocks=[LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; getBlockSize()=53; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]}] lastLocatedBlock=LocatedBlock{BP-1590194288-10.162.26.113-1587096223927:blk_1073895975_155796; getBlockSize()=53; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[DN1:25009,DS-1dcbe5bd-f69a-422c-bea6-a41bda773084,DISK], DatanodeInfoWithStorage[DN2:25009,DS-cd06a4f9-c25d-42ab-887b-f129707dba17,DISK]]} isLastBlockComplete=false} {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15427) Merged ListStatus with Fallback target filesystem and InternalDirViewFS.
[ https://issues.apache.org/jira/browse/HDFS-15427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142757#comment-17142757 ] Hudson commented on HDFS-15427: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18374 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18374/]) HDFS-15427. Merged ListStatus with Fallback target filesystem and (github: rev 7c02d1889bbeabc73c95a4c83f0cd204365ff410) * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFs.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemLinkFallback.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/InodeTree.java > Merged ListStatus with Fallback target filesystem and InternalDirViewFS. > > > Key: HDFS-15427 > URL: https://issues.apache.org/jira/browse/HDFS-15427 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: viewfs >Affects Versions: 3.2.1 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Fix For: 3.4.0 > > > Currently ListStatus will not consider fallback directory when passed path is > an internal Directory(except root). > Since we configured fallback, we should be able to list fallback directories > when passed path is internal directory. It should list the union of > fallbackDir and internalDir. > So, that fallback directories will not be shaded when path matched to > internal dir. > > The idea here is, user configured default filesystem with fallback fs, then > every operation not having link should go to fallback fs. That way users need > not configure all paths as mount from default fs. > > This will be very useful in the case of ViewFSOverloadScheme. 
> In ViewFSOverloadScheme, if you choose your existing cluster to be configured > as fallback fs, then you can configure desired mount paths to external fs and > rest other path should go to fallback. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-15427) Merged ListStatus with Fallback target filesystem and InternalDirViewFS.
[ https://issues.apache.org/jira/browse/HDFS-15427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G resolved HDFS-15427. Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Thanks a lot [~ayushsaxena] for reviews! > Merged ListStatus with Fallback target filesystem and InternalDirViewFS. > > > Key: HDFS-15427 > URL: https://issues.apache.org/jira/browse/HDFS-15427 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: viewfs >Affects Versions: 3.2.1 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Fix For: 3.4.0 > > > Currently ListStatus will not consider fallback directory when passed path is > an internal Directory(except root). > Since we configured fallback, we should be able to list fallback directories > when passed path is internal directory. It should list the union of > fallbackDir and internalDir. > So, that fallback directories will not be shaded when path matched to > internal dir. > > The idea here is, user configured default filesystem with fallback fs, then > every operation not having link should go to fallback fs. That way users need > not configure all paths as mount from default fs. > > This will be very useful in the case of ViewFSOverloadScheme. > In ViewFSOverloadScheme, if you choose your existing cluster to be configured > as fallback fs, then you can configure desired mount paths to external fs and > rest other path should go to fallback. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
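The "union of fallbackDir and internalDir" described in the issue can be sketched as a plain merge of two listings. This is a minimal illustration, not the ViewFileSystem implementation: `MergedListStatusSketch` and `mergedListing` are hypothetical names, entries are plain strings rather than FileStatus objects, and letting the internal (mount table) entry win on a name collision is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only; names and collision policy are assumptions for illustration. */
class MergedListStatusSketch {

  /**
   * Returns the union of the children of an internal mount dir and the
   * children of the same path in the fallback filesystem, sorted by name.
   */
  static List<String> mergedListing(List<String> internalChildren,
      List<String> fallbackChildren) {
    Map<String, String> merged = new TreeMap<>();
    for (String name : fallbackChildren) {
      merged.put(name, "fallback:" + name);
    }
    for (String name : internalChildren) {
      // Assumption: an internal entry shadows a fallback entry of the same name.
      merged.put(name, "internal:" + name);
    }
    return new ArrayList<>(merged.values());
  }

  public static void main(String[] args) {
    List<String> internal = Arrays.asList("data", "user");
    List<String> fallback = Arrays.asList("tmp", "user");
    System.out.println(mergedListing(internal, fallback));
  }
}
```

With this merge, a fallback directory such as `tmp` is no longer hidden when the listed path matches an internal dir.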
[jira] [Updated] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15421: - Target Version/s: 3.3.0, 3.1.4, 3.2.2, 2.10.1 (was: 2.10.1) Labels: release-blocker (was: ) I observed the leak in our 3.3.0-SNAPSHOT dev cluster. Adding the target versions. > IBR leak causes standby NN to be stuck in safe mode > --- > > Key: HDFS-15421 > URL: https://issues.apache.org/jira/browse/HDFS-15421 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Priority: Blocker > Labels: release-blocker > > After HDFS-14941, update of the global gen stamp is delayed in certain > situations. This makes the last set of incremental block reports from append > "from future", which causes it to be simply re-queued to the pending DN > message queue, rather than processed to complete the block. The last set of > IBRs will leak and never cleaned until it transitions to active. The size of > {{pendingDNMessages}} constantly grows until then. > If a leak happens while in a startup safe mode, the namenode will never be > able to come out of safe mode on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15430) create should work when parent dir is internalDir and fallback configured.
[ https://issues.apache.org/jira/browse/HDFS-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15430: --- Description: create does not work if the parent dir is an internal mount dir (a non-leaf in the mount path), even when a fallback is configured. Since a fallback is available, if the same tree structure exists in the fallback, we should be able to create in the fallback fs. > create should work when parent dir is internalDir and fallback configured. > --- > > Key: HDFS-15430 > URL: https://issues.apache.org/jira/browse/HDFS-15430 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.2.1 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > > create does not work if the parent dir is an internal mount dir (a non-leaf in > the mount path), even when a fallback is configured. > Since a fallback is available, if the same tree structure exists in the fallback, > we should be able to create in the fallback fs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15430) create should work when parent dir is internalDir and fallback configured.
Uma Maheswara Rao G created HDFS-15430: -- Summary: create should work when parent dir is internalDir and fallback configured. Key: HDFS-15430 URL: https://issues.apache.org/jira/browse/HDFS-15430 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.2.1 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15429) mkdirs should work when parent dir is internalDir and fallback configured.
Uma Maheswara Rao G created HDFS-15429: -- Summary: mkdirs should work when parent dir is internalDir and fallback configured. Key: HDFS-15429 URL: https://issues.apache.org/jira/browse/HDFS-15429 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.2.1 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G mkdir does not work if the parent dir is an internal mount dir (a non-leaf in the mount path), even when a fallback is configured. Since a fallback is available, if the same tree structure exists in the fallback, we should be able to mkdir in the fallback. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
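The behaviour requested here can be sketched as a path-resolution rule. This is a toy model under stated assumptions, not the ViewFs code: `FallbackMkdirSketch` and its methods are hypothetical, the mount table is a set of link paths, the fallback is a root URI plus a set of directories that already exist in it, and the `hdfs://fallback` URI is made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Toy model only; all names and the resolution rule are assumptions. */
class FallbackMkdirSketch {

  /** A dir is "internal" if some mount link lies strictly below it. */
  static boolean isInternalDir(Set<String> mountLinks, String dir) {
    String prefix = dir.endsWith("/") ? dir : dir + "/";
    for (String link : mountLinks) {
      if (link.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  static String parentOf(String path) {
    int idx = path.lastIndexOf('/');
    return idx <= 0 ? "/" : path.substring(0, idx);
  }

  /**
   * Returns the fallback location where mkdir should happen, or empty if
   * this sketch's rule does not apply (no fallback configured, the parent
   * is not an internal dir, or the parent is missing in the fallback).
   */
  static Optional<String> targetForMkdir(Set<String> mountLinks,
      Optional<String> fallbackRoot, Set<String> fallbackDirs, String path) {
    String parent = parentOf(path);
    boolean parentIsInternal = isInternalDir(mountLinks, parent);
    boolean sameTreeInFallback = fallbackDirs.contains(parent);
    if (parentIsInternal && !mountLinks.contains(path) && sameTreeInFallback) {
      return fallbackRoot.map(root -> root + path);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Set<String> links = new HashSet<>(Arrays.asList("/user/warehouse", "/data"));
    Set<String> fallbackDirs = new HashSet<>(Arrays.asList("/user"));
    System.out.println(targetForMkdir(links, Optional.of("hdfs://fallback"),
        fallbackDirs, "/user/newdir"));
  }
}
```

Here `/user` is an internal dir because the link `/user/warehouse` sits below it, so a mkdir of `/user/newdir` is routed to the fallback as long as `/user` exists there. The same rule would apply to the create case in HDFS-15430.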
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142713#comment-17142713 ] Akira Ajisaka commented on HDFS-15421: -- Thank you [~kihwal] for the detailed report. I read your report and the discussion in HDFS-14941. In an append operation, the ANN first logs {{OP_SET_GENSTAMP_V2}} and then logs {{OP_APPEND}}. After HDFS-14941, the SNN rolls the {{OP_SET_GENSTAMP_V2}} log and sets the impending genstamp without updating the global genstamp. Next, the SNN rolls the {{OP_APPEND}} log, but the global genstamp is still not updated. That's why the genstamp is never updated and the IBR always comes from the future. I think we need to update the genstamp when rolling {{OP_APPEND}}. The same applies to {{OP_TRUNCATE}}. > IBR leak causes standby NN to be stuck in safe mode > --- > > Key: HDFS-15421 > URL: https://issues.apache.org/jira/browse/HDFS-15421 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Priority: Blocker > > After HDFS-14941, update of the global gen stamp is delayed in certain > situations. This makes the last set of incremental block reports from append > "from future", which causes it to be simply re-queued to the pending DN > message queue, rather than processed to complete the block. The last set of > IBRs will leak and never cleaned until it transitions to active. The size of > {{pendingDNMessages}} constantly grows until then. > If a leak happens while in a startup safe mode, the namenode will never be > able to come out of safe mode on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
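The sequence described in the comment above can be replayed in a small toy model. This is a sketch under stated assumptions, not the NameNode code: `GenStampLeakSketch`, `Standby`, `applyEdit`, `receiveIbr` and `reprocessPending` are hypothetical names, an IBR is reduced to the generation stamp it carries, and "update the genstamp when rolling OP_APPEND" is modelled as copying the impending genstamp into the global one.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only; every name here is hypothetical, not a Hadoop API. */
class GenStampLeakSketch {
  enum Op { OP_SET_GENSTAMP_V2, OP_APPEND }

  static class Standby {
    long globalGenStamp;
    long impendingGenStamp;
    final boolean applyOnAppend;
    // Stand-in for pendingDNMessages: genstamps of queued IBRs.
    final Deque<Long> pendingDNMessages = new ArrayDeque<>();
    int processed = 0;

    Standby(long initialGenStamp, boolean applyOnAppend) {
      this.globalGenStamp = initialGenStamp;
      this.impendingGenStamp = initialGenStamp;
      this.applyOnAppend = applyOnAppend;
    }

    void applyEdit(Op op, long genStamp) {
      switch (op) {
        case OP_SET_GENSTAMP_V2:
          // Only the impending genstamp moves; the global one is untouched.
          impendingGenStamp = genStamp;
          break;
        case OP_APPEND:
          if (applyOnAppend) {
            // The update suggested in the comment.
            globalGenStamp = Math.max(globalGenStamp, impendingGenStamp);
          }
          break;
      }
    }

    /** An IBR newer than the global genstamp is "from the future" and queued. */
    void receiveIbr(long blockGenStamp) {
      if (blockGenStamp > globalGenStamp) {
        pendingDNMessages.add(blockGenStamp);
      } else {
        processed++;
      }
    }

    /** Retries each queued IBR once; still-future ones are queued again. */
    void reprocessPending() {
      int n = pendingDNMessages.size();
      for (int i = 0; i < n; i++) {
        receiveIbr(pendingDNMessages.poll());
      }
    }
  }

  /** Returns how many IBRs remain queued after the append edits are applied. */
  static int leaked(boolean applyOnAppend) {
    Standby snn = new Standby(1000L, applyOnAppend);
    snn.receiveIbr(1001L);                        // IBR from the append
    snn.applyEdit(Op.OP_SET_GENSTAMP_V2, 1001L);
    snn.applyEdit(Op.OP_APPEND, 1001L);
    snn.reprocessPending();
    return snn.pendingDNMessages.size();
  }

  public static void main(String[] args) {
    System.out.println("leaked without update: " + leaked(false));
    System.out.println("leaked with update:    " + leaked(true));
  }
}
```

Without the update on OP_APPEND, the queued IBR is re-queued on every retry, which is the growth of {{pendingDNMessages}} described in the issue. With the update, the retry processes it and the queue drains.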