[jira] [Created] (HDFS-15454) ViewFsOverloadScheme should not display error message with "viewfs://" even when it's initialized with other fs.
Uma Maheswara Rao G created HDFS-15454: -- Summary: ViewFsOverloadScheme should not display error message with "viewfs://" even when it's initialized with other fs. Key: HDFS-15454 URL: https://issues.apache.org/jira/browse/HDFS-15454 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G ViewFsOverloadScheme currently extends ViewFileSystem. When there are no mount links, fs initialization fails and throws an exception. When it fails, even if it was initialized via ViewFsOverloadScheme with another scheme (any scheme can be initialized, say hdfs://clustername), the exception message always refers to "viewfs://...": {code:java} java.io.IOException: ViewFs: Cannot initialize: Empty Mount table in config for viewfs://clustername/ {code} The message should instead be: {code:java} java.io.IOException: ViewFs: Cannot initialize: Empty Mount table in config for hdfs://clustername/ {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
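A minimal sketch of the intended fix (this is not the actual Hadoop patch; the class and method names here are hypothetical): derive the scheme and authority in the error message from the URI the filesystem was actually initialized with, instead of hard-coding viewfs.

```java
import java.net.URI;

// Hypothetical sketch: build the "empty mount table" message from the URI that
// initialization actually received, so an hdfs://... instance reports hdfs://...
public class MountTableError {
    public static String emptyMountTableMessage(URI initializedUri) {
        // Reuse the scheme (e.g. "hdfs" or "viewfs") and authority from initialization.
        return "ViewFs: Cannot initialize: Empty Mount table in config for "
            + initializedUri.getScheme() + "://" + initializedUri.getAuthority() + "/";
    }

    public static void main(String[] args) {
        System.out.println(emptyMountTableMessage(URI.create("hdfs://clustername")));
    }
}
```

With this shape of fix, the same code path produces "viewfs://..." only when the instance really was initialized with the viewfs scheme.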
[jira] [Created] (HDFS-15453) Implement ViewFsAdmin to list the mount points, target fs for path etc.
Uma Maheswara Rao G created HDFS-15453: -- Summary: Implement ViewFsAdmin to list the mount points, target fs for path etc. Key: HDFS-15453 URL: https://issues.apache.org/jira/browse/HDFS-15453 Project: Hadoop HDFS Issue Type: Sub-task Components: viewfs, viewfsOverloadScheme Reporter: Uma Maheswara Rao G I think it may be a good idea to have some admin commands to list mount points, get the target fs type for a given path, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15450) Fix NN trash emptier to work if ViewFSOveroadScheme enabled
[ https://issues.apache.org/jira/browse/HDFS-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15450: --- Component/s: viewfsOverloadScheme namenode > Fix NN trash emptier to work if ViewFSOveroadScheme enabled > --- > > Key: HDFS-15450 > URL: https://issues.apache.org/jira/browse/HDFS-15450 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode, viewfsOverloadScheme >Affects Versions: 3.2.1 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Fix For: 3.4.0 > > > When users add mount links only for fs.defaultFS, in an HA NN, it will initialize the > trashEmptier with the RPC address set to the defaultFS. It will fail to start because > no mount links may have been configured with an RPC-address-based URI. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15450) Fix NN trash emptier to work if ViewFSOveroadScheme enabled
[ https://issues.apache.org/jira/browse/HDFS-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151424#comment-17151424 ] Hudson commented on HDFS-15450: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18408 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18408/]) HDFS-15450. Fix NN trash emptier to work if ViewFSOveroadScheme enabled. (github: rev 55a2ae80dc9b45413febd33840b8a653e3e29440) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestNNStartupWhenViewFSOverloadSchemeEnabled.java > Fix NN trash emptier to work if ViewFSOveroadScheme enabled > --- > > Key: HDFS-15450 > URL: https://issues.apache.org/jira/browse/HDFS-15450 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.2.1 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Fix For: 3.4.0 > > > When users add mount links only for fs.defaultFS, in an HA NN, it will initialize the > trashEmptier with the RPC address set to the defaultFS. It will fail to start because > no mount links may have been configured with an RPC-address-based URI. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15450) Fix NN trash emptier to work in HA mode if ViewFSOveroadScheme enabled
[ https://issues.apache.org/jira/browse/HDFS-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15450: --- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~surendralilhore] for the review! I have just committed this to trunk. > Fix NN trash emptier to work in HA mode if ViewFSOveroadScheme enabled > -- > > Key: HDFS-15450 > URL: https://issues.apache.org/jira/browse/HDFS-15450 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.2.1 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Fix For: 3.4.0 > > > When users add mount links only for fs.defaultFS, in an HA NN, it will initialize the > trashEmptier with the RPC address set to the defaultFS. It will fail to start because > no mount links may have been configured with an RPC-address-based URI. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15450) Fix NN trash emptier to work if ViewFSOveroadScheme enabled
[ https://issues.apache.org/jira/browse/HDFS-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15450: --- Summary: Fix NN trash emptier to work if ViewFSOveroadScheme enabled (was: Fix NN trash emptier to work in HA mode if ViewFSOveroadScheme enabled) > Fix NN trash emptier to work if ViewFSOveroadScheme enabled > --- > > Key: HDFS-15450 > URL: https://issues.apache.org/jira/browse/HDFS-15450 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.2.1 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Fix For: 3.4.0 > > > When users add mount links only for fs.defaultFS, in an HA NN, it will initialize the > trashEmptier with the RPC address set to the defaultFS. It will fail to start because > no mount links may have been configured with an RPC-address-based URI. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15438) Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy
[ https://issues.apache.org/jira/browse/HDFS-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151394#comment-17151394 ] AMC-team commented on HDFS-15438: - I added a patch that changes the while-loop condition to handle max disk errors = 0. > Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy > -- > > Key: HDFS-15438 > URL: https://issues.apache.org/jira/browse/HDFS-15438 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: AMC-team >Priority: Major > Attachments: HDFS-15438.000.patch > > > In the HDFS disk balancer, the config parameter > "dfs.disk.balancer.max.disk.errors" controls the maximum number of > errors we can ignore for a specific move between two disks before the move is > abandoned. > The parameter accepts values >= 0, and setting the value to 0 should mean > no error tolerance. However, setting the value to 0 simply skips the block > copy even when no disk error occurs, because the while-loop condition > *item.getErrorCount() < getMaxError(item)* is never satisfied. > {code:java} > // Gets the next block that we can copy > private ExtendedBlock getBlockToCopy(FsVolumeSpi.BlockIterator iter, > DiskBalancerWorkItem item) { > while (!iter.atEnd() && item.getErrorCount() < getMaxError(item)) { > try { > ... //get the block > } catch (IOException e) { > item.incErrorCount(); > } > if (item.getErrorCount() >= getMaxError(item)) { > item.setErrMsg("Error count exceeded."); > LOG.info("Maximum error count exceeded. Error count: {} Max error:{} ", > item.getErrorCount(), item.getMaxDiskErrors()); > } > {code} > *How to fix* > Change the while-loop condition to support the value 0. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
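The off-by-one can be seen in isolation with a small simulation (a sketch, not the attached patch; the class and method names are hypothetical). With the original condition `errorCount < maxErrors`, maxErrors = 0 never enters the loop; comparing with `<=` instead lets the copy proceed and still abandons the move on the first error:

```java
// Simulates the bound of getBlockToCopy's loop. errorAt[i] marks a simulated
// disk error while copying block i.
public class MaxErrorLoop {
    public static int blocksCopied(int maxErrors, boolean[] errorAt) {
        int copied = 0;
        int errorCount = 0;
        int i = 0;
        // The original condition was errorCount < maxErrors, which is already
        // false when maxErrors == 0, so nothing was ever copied.
        while (i < errorAt.length && errorCount <= maxErrors) {
            if (errorAt[i]) {
                errorCount++;   // stands in for the IOException catch block
            } else {
                copied++;
            }
            i++;
        }
        return copied;
    }
}
```

With maxErrors = 0 and no disk errors, all blocks are now copied; the first error still stops the move immediately.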
[jira] [Commented] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.
[ https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151393#comment-17151393 ] AMC-team commented on HDFS-15439: - I added a patch that changes the if condition, thus handling negative values of dfs.mover.retry.max.attempts. > Setting dfs.mover.retry.max.attempts to negative value will retry forever. > -- > > Key: HDFS-15439 > URL: https://issues.apache.org/jira/browse/HDFS-15439 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: AMC-team >Priority: Major > Attachments: HDFS-15439.000.patch > > > The configuration parameter "dfs.mover.retry.max.attempts" defines the > maximum number of retries before the mover considers the move failed. There is > no checking code, so this parameter can accept any int value. > Theoretically, setting this value to <= 0 should mean no retries at all. > However, if you set it to a negative value, the retry-failure check is > never satisfied because the if statement is "*if > (retryCount.get() == retryMaxAttempts)*". The retry count is always incremented by > retryCount.incrementAndGet() after a failure but never equals *retryMaxAttempts*. > {code:java} > private Result processNamespace() throws IOException { > ... //wait for pending move to finish and retry the failed migration > if (hasFailed && !hasSuccess) { > if (retryCount.get() == retryMaxAttempts) { > result.setRetryFailed(); > LOG.error("Failed to move some block's after " > + retryMaxAttempts + " retries."); > return result; > } else { > retryCount.incrementAndGet(); > } > } else { > // Reset retry count if no failure. > retryCount.set(0); > } > ... > } > {code} > *How to fix* > Add checking code for "dfs.mover.retry.max.attempts" to accept only > non-negative values, or change the if statement condition to trigger when the retry count > exceeds max attempts.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
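The second suggested fix can be sketched as follows (a hypothetical helper, not the attached patch): comparing with `>=` instead of `==` makes any non-positive dfs.mover.retry.max.attempts value give up immediately instead of retrying forever.

```java
// Decides whether the mover should give up after a failed iteration.
public class RetryCheck {
    public static boolean shouldGiveUp(int retryCount, int retryMaxAttempts) {
        // The original test was retryCount == retryMaxAttempts, which a
        // negative retryMaxAttempts can never satisfy since retryCount >= 0.
        return retryCount >= retryMaxAttempts;
    }
}
```

The `>=` form also keeps the documented behavior for positive settings: the mover still stops once the retry count reaches the configured maximum.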
[jira] [Updated] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.
[ https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15439: Attachment: HDFS-15439.000.patch > Setting dfs.mover.retry.max.attempts to negative value will retry forever. > -- > > Key: HDFS-15439 > URL: https://issues.apache.org/jira/browse/HDFS-15439 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: AMC-team >Priority: Major > Attachments: HDFS-15439.000.patch > > > The configuration parameter "dfs.mover.retry.max.attempts" defines the > maximum number of retries before the mover considers the move failed. There is > no checking code, so this parameter can accept any int value. > Theoretically, setting this value to <= 0 should mean no retries at all. > However, if you set it to a negative value, the retry-failure check is > never satisfied because the if statement is "*if > (retryCount.get() == retryMaxAttempts)*". The retry count is always incremented by > retryCount.incrementAndGet() after a failure but never equals *retryMaxAttempts*. > {code:java} > private Result processNamespace() throws IOException { > ... //wait for pending move to finish and retry the failed migration > if (hasFailed && !hasSuccess) { > if (retryCount.get() == retryMaxAttempts) { > result.setRetryFailed(); > LOG.error("Failed to move some block's after " > + retryMaxAttempts + " retries."); > return result; > } else { > retryCount.incrementAndGet(); > } > } else { > // Reset retry count if no failure. > retryCount.set(0); > } > ... > } > {code} > *How to fix* > Add checking code for "dfs.mover.retry.max.attempts" to accept only > non-negative values, or change the if statement condition to trigger when the retry count > exceeds max attempts. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.
[ https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15439: Description: The configuration parameter "dfs.mover.retry.max.attempts" defines the maximum number of retries before the mover considers the move failed. There is no checking code, so this parameter can accept any int value. Theoretically, setting this value to <= 0 should mean no retries at all. However, if you set it to a negative value, the retry-failure check is never satisfied because the if statement is "*if (retryCount.get() == retryMaxAttempts)*". The retry count is always incremented by retryCount.incrementAndGet() after a failure but never equals *retryMaxAttempts*. {code:java} private Result processNamespace() throws IOException { ... //wait for pending move to finish and retry the failed migration if (hasFailed && !hasSuccess) { if (retryCount.get() == retryMaxAttempts) { result.setRetryFailed(); LOG.error("Failed to move some block's after " + retryMaxAttempts + " retries."); return result; } else { retryCount.incrementAndGet(); } } else { // Reset retry count if no failure. retryCount.set(0); } ... } {code} *How to fix* Add checking code for "dfs.mover.retry.max.attempts" to accept only non-negative values, or change the if statement condition to trigger when the retry count exceeds max attempts. was: The configuration parameter "dfs.mover.retry.max.attempts" defines the maximum number of retries before the mover considers the move failed. There is no checking code, so this parameter can accept any int value. Theoretically, setting this value to <= 0 should mean no retries at all. However, if you set it to a negative value, the retry-failure check is never satisfied because the if statement is "*if (retryCount.get() == retryMaxAttempts)*". The retry count is always incremented by retryCount.incrementAndGet() after a failure but never equals *retryMaxAttempts*. {code:java} private Result processNamespace() throws IOException { ... //wait for pending move to finish and retry the failed migration if (hasFailed && !hasSuccess) { if (retryCount.get() == retryMaxAttempts) { // retryMaxAttempts < 0 result.setRetryFailed(); LOG.error("Failed to move some block's after " + retryMaxAttempts + " retries."); return result; } else { retryCount.incrementAndGet(); } } else { // Reset retry count if no failure. retryCount.set(0); } ... } {code} *How to fix* Add checking code for "dfs.mover.retry.max.attempts" to accept only non-negative values, or change the if statement condition to trigger when the retry count exceeds max attempts. > Setting dfs.mover.retry.max.attempts to negative value will retry forever. > -- > > Key: HDFS-15439 > URL: https://issues.apache.org/jira/browse/HDFS-15439 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: AMC-team >Priority: Major > > The configuration parameter "dfs.mover.retry.max.attempts" defines the > maximum number of retries before the mover considers the move failed. There is > no checking code, so this parameter can accept any int value. > Theoretically, setting this value to <= 0 should mean no retries at all. > However, if you set it to a negative value, the retry-failure check is > never satisfied because the if statement is "*if > (retryCount.get() == retryMaxAttempts)*". The retry count is always incremented by > retryCount.incrementAndGet() after a failure but never equals *retryMaxAttempts*. > {code:java} > private Result processNamespace() throws IOException { > ... //wait for pending move to finish and retry the failed migration > if (hasFailed && !hasSuccess) { > if (retryCount.get() == retryMaxAttempts) { > result.setRetryFailed(); > LOG.error("Failed to move some block's after " > + retryMaxAttempts + " retries."); > return result; > } else { > retryCount.incrementAndGet(); > } > } else { > // Reset retry count if no failure. > retryCount.set(0); > } > ... > } > {code} > *How to fix* > Add checking code for "dfs.mover.retry.max.attempts" to accept only > non-negative values, or change the if statement condition to trigger when the retry count > exceeds max attempts. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15438) Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy
[ https://issues.apache.org/jira/browse/HDFS-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15438: Attachment: HDFS-15438.000.patch > Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy > -- > > Key: HDFS-15438 > URL: https://issues.apache.org/jira/browse/HDFS-15438 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: AMC-team >Priority: Major > Attachments: HDFS-15438.000.patch > > > In the HDFS disk balancer, the config parameter > "dfs.disk.balancer.max.disk.errors" controls the maximum number of > errors we can ignore for a specific move between two disks before the move is > abandoned. > The parameter accepts values >= 0, and setting the value to 0 should mean > no error tolerance. However, setting the value to 0 simply skips the block > copy even when no disk error occurs, because the while-loop condition > *item.getErrorCount() < getMaxError(item)* is never satisfied. > {code:java} > // Gets the next block that we can copy > private ExtendedBlock getBlockToCopy(FsVolumeSpi.BlockIterator iter, > DiskBalancerWorkItem item) { > while (!iter.atEnd() && item.getErrorCount() < getMaxError(item)) { > try { > ... //get the block > } catch (IOException e) { > item.incErrorCount(); > } > if (item.getErrorCount() >= getMaxError(item)) { > item.setErrMsg("Error count exceeded."); > LOG.info("Maximum error count exceeded. Error count: {} Max error:{} ", > item.getErrorCount(), item.getMaxDiskErrors()); > } > {code} > *How to fix* > Change the while-loop condition to support the value 0. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15438) Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy
[ https://issues.apache.org/jira/browse/HDFS-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15438: Summary: Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy (was: dfs.disk.balancer.max.disk.errors = 0 will fail the block copy) > Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy > -- > > Key: HDFS-15438 > URL: https://issues.apache.org/jira/browse/HDFS-15438 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: AMC-team >Priority: Major > > In the HDFS disk balancer, the config parameter > "dfs.disk.balancer.max.disk.errors" controls the maximum number of > errors we can ignore for a specific move between two disks before the move is > abandoned. > The parameter accepts values >= 0, and setting the value to 0 should mean > no error tolerance. However, setting the value to 0 simply skips the block > copy even when no disk error occurs, because the while-loop condition > *item.getErrorCount() < getMaxError(item)* is never satisfied. > {code:java} > // Gets the next block that we can copy > private ExtendedBlock getBlockToCopy(FsVolumeSpi.BlockIterator iter, > DiskBalancerWorkItem item) { > while (!iter.atEnd() && item.getErrorCount() < getMaxError(item)) { > try { > ... //get the block > } catch (IOException e) { > item.incErrorCount(); > } > if (item.getErrorCount() >= getMaxError(item)) { > item.setErrMsg("Error count exceeded."); > LOG.info("Maximum error count exceeded. Error count: {} Max error:{} ", > item.getErrorCount(), item.getMaxDiskErrors()); > } > {code} > *How to fix* > Change the while-loop condition to support the value 0. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15443) Setting dfs.datanode.max.transfer.threads to a very small value can cause strange failure.
[ https://issues.apache.org/jira/browse/HDFS-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151386#comment-17151386 ] AMC-team commented on HDFS-15443: - [~jianghuazhu], [~elgoiri] I added a simple patch to do the sanity check. I think it may at least help prevent some accidental misconfigurations. > Setting dfs.datanode.max.transfer.threads to a very small value can cause > strange failure. > -- > > Key: HDFS-15443 > URL: https://issues.apache.org/jira/browse/HDFS-15443 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: AMC-team >Priority: Major > Attachments: HDFS-15443.000.patch > > > Configuration parameter dfs.datanode.max.transfer.threads specifies the > maximum number of threads to use for transferring data in and out of the DN. > This is a vital param that needs to be tuned carefully. > {code:java} > // DataXceiverServer.java > // Make sure the xceiver count is not exceeded > int curXceiverCount = datanode.getXceiverCount(); > if (curXceiverCount > maxXceiverCount) { > throw new IOException("Xceiver count " + curXceiverCount > + " exceeds the limit of concurrent xceivers: " > + maxXceiverCount); > } > {code} > Many issues are caused by not setting this param to an appropriate > value. However, there is no check code to restrict the parameter. > Although having a hard-and-fast rule is difficult because we need to consider > the number of cores, main memory, etc., *we can prevent users from accidentally > setting this value to an absolutely wrong value* (e.g. a negative value that > totally breaks the availability of the datanode). > *How to fix:* > Add proper check code for the parameter. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
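The kind of startup sanity check being proposed can be sketched as follows (a hypothetical method, not the attached patch; the actual patch may differ):

```java
// Rejects obviously broken values of dfs.datanode.max.transfer.threads at
// DataNode startup instead of failing every transfer later with an
// "exceeds the limit of concurrent xceivers" IOException.
public class TransferThreadsCheck {
    public static int checkMaxTransferThreads(int configured) {
        if (configured <= 0) {
            throw new IllegalArgumentException(
                "dfs.datanode.max.transfer.threads must be positive, got " + configured);
        }
        return configured;
    }
}
```

Failing fast at startup turns a silent availability problem into an immediate, explicit configuration error.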
[jira] [Updated] (HDFS-15443) Setting dfs.datanode.max.transfer.threads to a very small value can cause strange failure.
[ https://issues.apache.org/jira/browse/HDFS-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15443: Attachment: HDFS-15443.000.patch > Setting dfs.datanode.max.transfer.threads to a very small value can cause > strange failure. > -- > > Key: HDFS-15443 > URL: https://issues.apache.org/jira/browse/HDFS-15443 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: AMC-team >Priority: Major > Attachments: HDFS-15443.000.patch > > > Configuration parameter dfs.datanode.max.transfer.threads specifies the > maximum number of threads to use for transferring data in and out of the DN. > This is a vital param that needs to be tuned carefully. > {code:java} > // DataXceiverServer.java > // Make sure the xceiver count is not exceeded > int curXceiverCount = datanode.getXceiverCount(); > if (curXceiverCount > maxXceiverCount) { > throw new IOException("Xceiver count " + curXceiverCount > + " exceeds the limit of concurrent xceivers: " > + maxXceiverCount); > } > {code} > Many issues are caused by not setting this param to an appropriate > value. However, there is no check code to restrict the parameter. > Although having a hard-and-fast rule is difficult because we need to consider > the number of cores, main memory, etc., *we can prevent users from accidentally > setting this value to an absolutely wrong value* (e.g. a negative value that > totally breaks the availability of the datanode). > *How to fix:* > Add proper check code for the parameter. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15443) Setting dfs.datanode.max.transfer.threads to a very small value can cause strange failure.
[ https://issues.apache.org/jira/browse/HDFS-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15443: Attachment: (was: HDFS-15443.000.patch) > Setting dfs.datanode.max.transfer.threads to a very small value can cause > strange failure. > -- > > Key: HDFS-15443 > URL: https://issues.apache.org/jira/browse/HDFS-15443 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: AMC-team >Priority: Major > > Configuration parameter dfs.datanode.max.transfer.threads specifies the > maximum number of threads to use for transferring data in and out of the DN. > This is a vital param that needs to be tuned carefully. > {code:java} > // DataXceiverServer.java > // Make sure the xceiver count is not exceeded > int curXceiverCount = datanode.getXceiverCount(); > if (curXceiverCount > maxXceiverCount) { > throw new IOException("Xceiver count " + curXceiverCount > + " exceeds the limit of concurrent xceivers: " > + maxXceiverCount); > } > {code} > Many issues are caused by not setting this param to an appropriate > value. However, there is no check code to restrict the parameter. > Although having a hard-and-fast rule is difficult because we need to consider > the number of cores, main memory, etc., *we can prevent users from accidentally > setting this value to an absolutely wrong value* (e.g. a negative value that > totally breaks the availability of the datanode). > *How to fix:* > Add proper check code for the parameter. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15443) Setting dfs.datanode.max.transfer.threads to a very small value can cause strange failure.
[ https://issues.apache.org/jira/browse/HDFS-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15443: Attachment: HDFS-15443.000.patch > Setting dfs.datanode.max.transfer.threads to a very small value can cause > strange failure. > -- > > Key: HDFS-15443 > URL: https://issues.apache.org/jira/browse/HDFS-15443 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: AMC-team >Priority: Major > Attachments: HDFS-15443.000.patch > > > Configuration parameter dfs.datanode.max.transfer.threads specifies the > maximum number of threads to use for transferring data in and out of the DN. > This is a vital param that needs to be tuned carefully. > {code:java} > // DataXceiverServer.java > // Make sure the xceiver count is not exceeded > int curXceiverCount = datanode.getXceiverCount(); > if (curXceiverCount > maxXceiverCount) { > throw new IOException("Xceiver count " + curXceiverCount > + " exceeds the limit of concurrent xceivers: " > + maxXceiverCount); > } > {code} > Many issues are caused by not setting this param to an appropriate > value. However, there is no check code to restrict the parameter. > Although having a hard-and-fast rule is difficult because we need to consider > the number of cores, main memory, etc., *we can prevent users from accidentally > setting this value to an absolutely wrong value* (e.g. a negative value that > totally breaks the availability of the datanode). > *How to fix:* > Add proper check code for the parameter. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15452) Dynamically initialize the capacity of BlocksMap
[ https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151327#comment-17151327 ] jianghua zhu edited comment on HDFS-15452 at 7/4/20, 1:25 PM: -- We can use a configuration variable in hdfs-default.xml to optimize this work item. At the same time, we should also add a default constructor to the BlocksMap class. [~elgoiri], do you have any other suggestions? was (Author: jianghuazhu): We can use a configuration variable in hdfs-default.xml to optimize this work item. [~elgoiri], do you have any other suggestions? > Dynamically initialize the capacity of BlocksMap > > > Key: HDFS-15452 > URL: https://issues.apache.org/jira/browse/HDFS-15452 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.3 >Reporter: jianghua zhu >Assignee: jianghua zhu >Priority: Major > > The capacity of the BlocksMap in the BlockManager class is initialized with a > hard-coded 2% of total memory. This can be set to a dynamic value. > BlockManager#BlockManager() { > .. > // Compute the map capacity by allocating 2% of total memory > blocksMap = new BlocksMap( > LightWeightGSet.computeCapacity(2.0, "BlocksMap")); > .. > } > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15452) Dynamically initialize the capacity of BlocksMap
[ https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151327#comment-17151327 ] jianghua zhu commented on HDFS-15452: - We can use a configuration variable in hdfs-default.xml to optimize this work item. [~elgoiri], do you have any other suggestions? > Dynamically initialize the capacity of BlocksMap > > > Key: HDFS-15452 > URL: https://issues.apache.org/jira/browse/HDFS-15452 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.3 >Reporter: jianghua zhu >Assignee: jianghua zhu >Priority: Major > > The capacity of the BlocksMap in the BlockManager class is initialized with a > hard-coded 2% of total memory. This can be set to a dynamic value. > BlockManager#BlockManager() { > .. > // Compute the map capacity by allocating 2% of total memory > blocksMap = new BlocksMap( > LightWeightGSet.computeCapacity(2.0, "BlocksMap")); > .. > } > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15452) Dynamically initialize the capacity of BlocksMap
[ https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jianghua zhu reassigned HDFS-15452: --- Assignee: jianghua zhu > Dynamically initialize the capacity of BlocksMap > > > Key: HDFS-15452 > URL: https://issues.apache.org/jira/browse/HDFS-15452 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.3 >Reporter: jianghua zhu >Assignee: jianghua zhu >Priority: Major > > The capacity of the BlocksMap in the BlockManager class is initialized with a > hard-coded 2% of total memory. This can be set to a dynamic value. > BlockManager#BlockManager() { > .. > // Compute the map capacity by allocating 2% of total memory > blocksMap = new BlocksMap( > LightWeightGSet.computeCapacity(2.0, "BlocksMap")); > .. > } > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15452) Dynamically initialize the capacity of BlocksMap
[ https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jianghua zhu updated HDFS-15452:
--------------------------------

    Description: 
The default value for initializing BlocksMap in the BlockManager class is 2. This can be set to a dynamic value.

BlockManager#BlockManager() {
  ..
  // Compute the map capacity by allocating 2% of total memory
  blocksMap = new BlocksMap(
      LightWeightGSet.computeCapacity(2.0, "BlocksMap"));
  ..
}

  was:
The default value for initializing BlocksMap in the BlockManager class is 2. This can be set to a dynamic value.
[jira] [Updated] (HDFS-15452) Dynamically initialize the capacity of BlocksMap
[ https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jianghua zhu updated HDFS-15452:
--------------------------------

    Description: 
The default value for initializing BlocksMap in the BlockManager class is 2. This can be set to a dynamic value.
[jira] [Created] (HDFS-15452) Dynamically initialize the capacity of BlocksMap
jianghua zhu created HDFS-15452:
-----------------------------------

             Summary: Dynamically initialize the capacity of BlocksMap
                 Key: HDFS-15452
                 URL: https://issues.apache.org/jira/browse/HDFS-15452
             Project: Hadoop HDFS
          Issue Type: Improvement
    Affects Versions: 3.0.3
            Reporter: jianghua zhu
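The proposal above could be sketched as follows. This is a simplified, self-contained illustration only: the real computation lives in Hadoop's LightWeightGSet.computeCapacity(double, String), and the idea of reading the percentage from configuration (any property name such as "dfs.namenode.blocksmap.capacity.percentage") is a hypothetical assumption, not part of the committed code.

```java
// Simplified sketch of a configurable capacity percentage replacing the
// hard-coded 2.0 in BlockManager. Mirrors the documented behavior of
// LightWeightGSet.computeCapacity: take a percentage of heap, divide by the
// reference size, and round down to a power of two.
public class BlocksMapCapacity {

    /**
     * Compute a map capacity as a percentage of the given heap size,
     * rounded down to a power of two.
     */
    static int computeCapacity(double percentage, long maxMemoryBytes) {
        if (percentage <= 0 || percentage > 100) {
            throw new IllegalArgumentException("percentage must be in (0, 100]");
        }
        // Assume 8-byte references (64-bit JVM without compressed oops).
        final int referenceSize = 8;
        long entries = (long) (maxMemoryBytes * percentage / 100.0) / referenceSize;
        // Round down to the nearest power of two, capped at 2^30.
        int exponent = Math.min(30, 63 - Long.numberOfLeadingZeros(Math.max(1, entries)));
        return 1 << exponent;
    }

    public static void main(String[] args) {
        // A 1 GiB heap at the default 2% yields a 2^21-entry map.
        System.out.println(computeCapacity(2.0, 1L << 30)); // 2097152
    }
}
```

With a configuration property, the percentage could then vary per deployment instead of being fixed at 2% for every NameNode heap size.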
[jira] [Commented] (HDFS-15446) CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path
[ https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151223#comment-17151223 ]

Hudson commented on HDFS-15446:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18406 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18406/])
HDFS-15446. CreateSnapshotOp fails during edit log loading for (ayushsaxena: rev f86f15cf2003a7c74d6a8dffa4c61236bc0a208a)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshot.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java

> CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with
> error java.io.FileNotFoundException: Directory does not exist:
> /.reserved/raw/path
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-15446
>                 URL: https://issues.apache.org/jira/browse/HDFS-15446
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.2.0, 3.3.0
>            Reporter: Srinivasu Majeti
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: reserved-word, snapshot
>             Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
>         Attachments: HDFS-15446.001.patch, HDFS-15446.002.patch, HDFS-15446.003.patch
>
>
> After allowing snapshot creation for a path, say /app-logs, creating a snapshot on
> /.reserved/raw/app-logs succeeds. However, when the Standby Namenode is later restarted
> and tries to load the edit record OP_CREATE_SNAPSHOT, loading fails and the Standby
> Namenode shuts down with the exception "java.io.FileNotFoundException: Directory does
> not exist: /.reserved/raw/app-logs".
> Here are the steps to reproduce:
> {code:java}
> # hdfs dfs -ls /.reserved/raw/
> Found 15 items
> drwxrwxrwt   - yarn hadoop          0 2020-06-29 10:27 /.reserved/raw/app-logs
> drwxr-xr-x   - hive hadoop          0 2020-06-29 10:29 /.reserved/raw/prod
> ++
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /app-logs
> Allowing snapshot on /app-logs succeeded
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /prod
> Allowing snapshot on /prod succeeded
> ++
> # hdfs lsSnapshottableDir
> drwxrwxrwt 0 yarn hadoop 0 2020-06-29 10:27 1 65536 /app-logs
> drwxr-xr-x 0 hive hadoop 0 2020-06-29 10:29 1 65536 /prod
> ++
> [root@c3230-node2 ~]# hdfs dfs -createSnapshot /.reserved/raw/app-logs testSS
> Created snapshot /.reserved/raw/app-logs/.snapshot/testSS
> {code}
> Exception seen in the Standby Namenode while loading the snapshot creation edit record:
> {code:java}
> 2020-06-29 10:33:25,488 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
> java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/app-logs
>         at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.valueOf(INodeDirectory.java:60)
>         at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.getSnapshottableRoot(SnapshotManager.java:259)
>         at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.createSnapshot(SnapshotManager.java:307)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:772)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:257)
> {code}
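The stack trace shows that edit-log replay passes the literal "/.reserved/raw/app-logs" to SnapshotManager.getSnapshottableRoot, which only knows the snapshottable root as "/app-logs". The core idea of the fix is to resolve the raw-prefix back to the underlying path first. The sketch below is illustrative only; the helper name is hypothetical, and the committed patch actually changes FSDirectory and FSEditLogLoader.

```java
// Illustrative sketch: strip the "/.reserved/raw" prefix before looking up
// the snapshottable directory, so replay sees "/app-logs" rather than
// "/.reserved/raw/app-logs". Not the actual Hadoop implementation.
public class RawPathResolver {
    private static final String RAW_PREFIX = "/.reserved/raw";

    /** Strip the raw prefix, if present, returning the underlying path. */
    static String resolveRawPath(String path) {
        if (path.equals(RAW_PREFIX)) {
            return "/";
        }
        if (path.startsWith(RAW_PREFIX + "/")) {
            return path.substring(RAW_PREFIX.length());
        }
        return path; // not a raw path; leave unchanged
    }

    public static void main(String[] args) {
        System.out.println(resolveRawPath("/.reserved/raw/app-logs")); // /app-logs
        System.out.println(resolveRawPath("/prod"));                   // /prod
    }
}
```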
[jira] [Updated] (HDFS-15446) CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path
[ https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayush Saxena updated HDFS-15446:
--------------------------------

    Fix Version/s: 3.1.5
                   3.4.0
                   3.3.1
                   3.2.2
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)
[jira] [Commented] (HDFS-15446) CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path
[ https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151215#comment-17151215 ]

Ayush Saxena commented on HDFS-15446:
-------------------------------------

Committed to trunk, branch-3.3, branch-3.2, branch-3.1.
Thanx [~sodonnell] for the contribution, [~hemanthboyina] for the review and [~smajeti] for the report!!!
[jira] [Commented] (HDFS-15430) create should work when parent dir is internalDir and fallback configured.
[ https://issues.apache.org/jira/browse/HDFS-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151210#comment-17151210 ]

Hudson commented on HDFS-15430:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18405 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18405/])
HDFS-15430. create should work when parent dir is internalDir and (github: rev 1f2a80b5e5024aeb7fb1f8c31b8fdd0fdb88bb66)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFs.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFsLinkFallback.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemOverloadSchemeWithHdfsScheme.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemLinkFallback.java

> create should work when parent dir is internalDir and fallback configured.
> --------------------------------------------------------------------------
>
>                 Key: HDFS-15430
>                 URL: https://issues.apache.org/jira/browse/HDFS-15430
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.2.1
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>             Fix For: 3.4.0
>
>
> create will not work if the parent dir is an internal mount dir (a non-leaf dir in a
> mount path) and a fallback is configured.
> Since the fallback is available, and the same tree structure may exist in the fallback,
> we should be able to create the file in the fallback fs.
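The routing decision described in the issue can be sketched as below. This is a minimal illustration under the assumption that a mount table maps link prefixes to target file systems and an optional fallback fs may be configured; all class and method names here are hypothetical, and the real logic lives in ViewFileSystem/ViewFs.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Minimal sketch: pick the fs that should serve a create() for a path.
// A matching mount link wins; otherwise, if the parent is an internal
// (non-leaf) mount dir and a fallback is configured, route to the fallback.
public class MountResolver {
    private final TreeMap<String, String> mountLinks = new TreeMap<>();
    private final Optional<String> fallbackFs;

    MountResolver(Map<String, String> links, Optional<String> fallbackFs) {
        this.mountLinks.putAll(links);
        this.fallbackFs = fallbackFs;
    }

    String resolveForCreate(String path) {
        for (Map.Entry<String, String> e : mountLinks.entrySet()) {
            if (path.equals(e.getKey()) || path.startsWith(e.getKey() + "/")) {
                return e.getValue(); // path is under a mount link
            }
        }
        // Internal dir: without a fallback, create must fail (pre-fix behavior).
        return fallbackFs.orElseThrow(() ->
            new IllegalStateException("create not allowed in internal dir: " + path));
    }

    public static void main(String[] args) {
        MountResolver r = new MountResolver(
            Map.of("/a/b/data", "hdfs://cluster1/data"),
            Optional.of("hdfs://fallback/"));
        System.out.println(r.resolveForCreate("/a/b/data/file")); // mount target
        System.out.println(r.resolveForCreate("/a/b/newfile"));   // fallback fs
    }
}
```

The key point of the fix is the second branch: before HDFS-15430, a create under an internal mount dir failed even when a fallback with the same tree structure was available.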
[jira] [Updated] (HDFS-15430) create should work when parent dir is internalDir and fallback configured.
[ https://issues.apache.org/jira/browse/HDFS-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-15430:
---------------------------------------

    Fix Version/s: 3.4.0
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Thanks [~ayushtkn] for reviews!!! I have just committed this to trunk.
[jira] [Commented] (HDFS-15446) CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path
[ https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151201#comment-17151201 ]

Ayush Saxena commented on HDFS-15446:
-------------------------------------

v003 LGTM +1