[jira] [Created] (HDFS-15454) ViewFsOverloadScheme should not display error message with "viewfs://" even when it's initialized with other fs.

2020-07-04 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDFS-15454:
--

 Summary: ViewFsOverloadScheme should not display error message 
with "viewfs://" even when it's initialized with other fs.
 Key: HDFS-15454
 URL: https://issues.apache.org/jira/browse/HDFS-15454
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


Currently ViewFsOverloadScheme extends ViewFileSystem. When there are no 
mount links, fs initialization fails and throws an exception. When it fails, 
even if it was initialized via ViewFsOverloadScheme with another scheme (any 
scheme can be initialized, let's say hdfs://clustername), the exception 
message always refers to "viewfs://..."

{code:java}
java.io.IOException: ViewFs: Cannot initialize: Empty Mount table in config for 
viewfs://clustername/ 
{code}

The message should instead be:
{code:java}
java.io.IOException: ViewFs: Cannot initialize: Empty Mount table in config for 
hdfs://clustername/ 
{code}
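
For illustration, here is a minimal, self-contained sketch of the intended behaviour (not the actual ViewFileSystem code nor the eventual fix): derive the scheme and authority in the message from the URI the filesystem was initialized with, instead of hard-coding "viewfs://". The class and method names below are made up for this example.
{code:java}
import java.io.IOException;
import java.net.URI;

// Hypothetical helper, for illustration only: build the "empty mount table"
// message from the URI that initialize() actually received.
final class EmptyMountTableMessage {
  static IOException emptyMountTable(URI initializedUri) {
    return new IOException("ViewFs: Cannot initialize: Empty Mount table in config for "
        + initializedUri.getScheme() + "://" + initializedUri.getAuthority() + "/");
  }

  public static void main(String[] args) {
    // With ViewFsOverloadScheme the fs may have been initialized as hdfs://clustername
    System.out.println(emptyMountTable(URI.create("hdfs://clustername/")).getMessage());
  }
}
{code}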






[jira] [Created] (HDFS-15453) Implement ViewFsAdmin to list the mount points, target fs for path etc.

2020-07-04 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDFS-15453:
--

 Summary: Implement ViewFsAdmin to list the mount points, target fs 
for path etc.
 Key: HDFS-15453
 URL: https://issues.apache.org/jira/browse/HDFS-15453
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: viewfs, viewfsOverloadScheme
Reporter: Uma Maheswara Rao G


I think it may be a good idea to have some admin commands to list the mount 
points, get the target fs type for a given path, etc.






[jira] [Updated] (HDFS-15450) Fix NN trash emptier to work if ViewFSOveroadScheme enabled

2020-07-04 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-15450:
---
Component/s: viewfsOverloadScheme
 namenode

> Fix NN trash emptier to work if ViewFSOveroadScheme enabled
> ---
>
> Key: HDFS-15450
> URL: https://issues.apache.org/jira/browse/HDFS-15450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, viewfsOverloadScheme
>Affects Versions: 3.2.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Fix For: 3.4.0
>
>
> When users add mount links only for fs.defaultFS, in an HA NN the trash 
> emptier is initialized with the NN RPC address as its defaultFS. It will fail 
> to start because we might not have configured any mount links with an 
> RPC-address-based URI.






[jira] [Commented] (HDFS-15450) Fix NN trash emptier to work if ViewFSOveroadScheme enabled

2020-07-04 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151424#comment-17151424
 ] 

Hudson commented on HDFS-15450:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18408 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18408/])
HDFS-15450. Fix NN trash emptier to work if ViewFSOveroadScheme enabled. 
(github: rev 55a2ae80dc9b45413febd33840b8a653e3e29440)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestNNStartupWhenViewFSOverloadSchemeEnabled.java


> Fix NN trash emptier to work if ViewFSOveroadScheme enabled
> ---
>
> Key: HDFS-15450
> URL: https://issues.apache.org/jira/browse/HDFS-15450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.2.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Fix For: 3.4.0
>
>
> When users add mount links only for fs.defaultFS, in an HA NN the trash 
> emptier is initialized with the NN RPC address as its defaultFS. It will fail 
> to start because we might not have configured any mount links with an 
> RPC-address-based URI.






[jira] [Updated] (HDFS-15450) Fix NN trash emptier to work in HA mode if ViewFSOveroadScheme enabled

2020-07-04 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-15450:
---
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~surendralilhore] for review!
I have just committed this to trunk.

> Fix NN trash emptier to work in HA mode if ViewFSOveroadScheme enabled
> --
>
> Key: HDFS-15450
> URL: https://issues.apache.org/jira/browse/HDFS-15450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.2.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Fix For: 3.4.0
>
>
> When users add mount links only for fs.defaultFS, in an HA NN the trash 
> emptier is initialized with the NN RPC address as its defaultFS. It will fail 
> to start because we might not have configured any mount links with an 
> RPC-address-based URI.






[jira] [Updated] (HDFS-15450) Fix NN trash emptier to work if ViewFSOveroadScheme enabled

2020-07-04 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-15450:
---
Summary: Fix NN trash emptier to work if ViewFSOveroadScheme enabled  (was: 
Fix NN trash emptier to work in HA mode if ViewFSOveroadScheme enabled)

> Fix NN trash emptier to work if ViewFSOveroadScheme enabled
> ---
>
> Key: HDFS-15450
> URL: https://issues.apache.org/jira/browse/HDFS-15450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.2.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Fix For: 3.4.0
>
>
> When users add mount links only for fs.defaultFS, in an HA NN the trash 
> emptier is initialized with the NN RPC address as its defaultFS. It will fail 
> to start because we might not have configured any mount links with an 
> RPC-address-based URI.






[jira] [Commented] (HDFS-15438) Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy

2020-07-04 Thread AMC-team (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151394#comment-17151394
 ] 

AMC-team commented on HDFS-15438:
-

I added a patch to change the while loop condition so that max disk errors = 0 
is handled.
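For illustration only (a sketch of the idea, not necessarily what the attached HDFS-15438.000.patch does), here is a self-contained model of the loop guard that treats max disk errors = 0 as "copy, but abandon on the first error" while leaving positive values unchanged:
{code:java}
// Standalone model of the DiskBalancer loop guard, for illustration only.
// Original guard: errorCount < maxError, which with maxError == 0 never lets
// the copy start; the revised guard copies until the first error instead.
final class MaxDiskErrorsGuard {
  static boolean shouldContinue(int errorCount, int maxError) {
    return maxError == 0 ? errorCount == 0 : errorCount < maxError;
  }

  public static void main(String[] args) {
    System.out.println(shouldContinue(0, 0)); // true  -> copying proceeds while no errors yet
    System.out.println(shouldContinue(1, 0)); // false -> abandon on the first error
    System.out.println(shouldContinue(2, 5)); // true  -> behaviour unchanged for maxError > 0
  }
}
{code}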

> Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy
> --
>
> Key: HDFS-15438
> URL: https://issues.apache.org/jira/browse/HDFS-15438
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15438.000.patch
>
>
> In the HDFS disk balancer, the config parameter 
> "dfs.disk.balancer.max.disk.errors" controls the maximum number of errors we 
> can ignore for a specific move between two disks before it is abandoned.
> The parameter can accept any value >= 0, and setting the value to 0 should 
> mean no error tolerance. However, setting the value to 0 simply skips the 
> block copy even when no disk error occurs, because the while loop condition 
> *item.getErrorCount() < getMaxError(item)* is never satisfied.
> {code:java}
> // Gets the next block that we can copy
> private ExtendedBlock getBlockToCopy(FsVolumeSpi.BlockIterator iter,
>     DiskBalancerWorkItem item) {
>   while (!iter.atEnd() && item.getErrorCount() < getMaxError(item)) {
>     try {
>       ... // get the block
>     } catch (IOException e) {
>       item.incErrorCount();
>     }
>     if (item.getErrorCount() >= getMaxError(item)) {
>       item.setErrMsg("Error count exceeded.");
>       LOG.info("Maximum error count exceeded. Error count: {} Max error: {}",
>           item.getErrorCount(), item.getMaxDiskErrors());
>     }
> {code}
> *How to fix*
> Change the while loop condition to support value 0.
>   






[jira] [Commented] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.

2020-07-04 Thread AMC-team (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151393#comment-17151393
 ] 

AMC-team commented on HDFS-15439:
-

I added a patch to change the if condition, thus handling negative values of 
dfs.mover.retry.max.attempts.
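For illustration only (not necessarily the attached HDFS-15439.000.patch), a self-contained sketch of the revised check: with the original "retryCount == retryMaxAttempts" a negative setting is never reached, whereas ">=" (or clamping the configured value to a minimum of 0 when it is read) terminates retries for any non-positive setting.
{code:java}
// Standalone model of the Mover retry check, for illustration only.
final class MoverRetryCheck {
  static boolean retriesExhausted(int retryCount, int retryMaxAttempts) {
    // Original check: retryCount == retryMaxAttempts
    return retryCount >= retryMaxAttempts;
  }

  public static void main(String[] args) {
    System.out.println(retriesExhausted(0, -1));  // true  -> a negative value now fails immediately
    System.out.println(retriesExhausted(3, 10));  // false -> still within the retry budget
    System.out.println(retriesExhausted(10, 10)); // true  -> budget exhausted
  }
}
{code}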

> Setting dfs.mover.retry.max.attempts to negative value will retry forever.
> --
>
> Key: HDFS-15439
> URL: https://issues.apache.org/jira/browse/HDFS-15439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15439.000.patch
>
>
> Configuration parameter "dfs.mover.retry.max.attempts" is to define the 
> maximum number of retries before the mover consider the move failed. There is 
> no checking code so this parameter can accept any int value.
> Theoratically, setting this value to <=0 should mean that no retry at all. 
> However, if you set the value to negative value. The checking condition for 
> retry failed will never satisfied because the if statement is "*if 
> (retryCount.get() == retryMaxAttempts)*". The retry count will always +1 by 
> retryCount.incrementAndGet() after failed but never *=* *retryMaxAttempts.* 
> {code:java}
> private Result processNamespace() throws IOException {
>   ... //wait for pending move to finish and retry the failed migration
>   if (hasFailed && !hasSuccess) {
> if (retryCount.get() == retryMaxAttempts) {
>   result.setRetryFailed();
>   LOG.error("Failed to move some block's after "
>   + retryMaxAttempts + " retries.");
>   return result;
> } else {
>   retryCount.incrementAndGet();
> }
>   } else {
> // Reset retry count if no failure.
> retryCount.set(0);
>   }
>   ...
> }
> {code}
> *How to fix*
> Add checking code for "dfs.mover.retry.max.attempts" so that it accepts only 
> non-negative values, or change the if statement condition to trigger when the 
> retry count exceeds the max attempts.






[jira] [Updated] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.

2020-07-04 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15439:

Attachment: HDFS-15439.000.patch

> Setting dfs.mover.retry.max.attempts to negative value will retry forever.
> --
>
> Key: HDFS-15439
> URL: https://issues.apache.org/jira/browse/HDFS-15439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15439.000.patch
>
>
> Configuration parameter "dfs.mover.retry.max.attempts" is to define the 
> maximum number of retries before the mover consider the move failed. There is 
> no checking code so this parameter can accept any int value.
> Theoratically, setting this value to <=0 should mean that no retry at all. 
> However, if you set the value to negative value. The checking condition for 
> retry failed will never satisfied because the if statement is "*if 
> (retryCount.get() == retryMaxAttempts)*". The retry count will always +1 by 
> retryCount.incrementAndGet() after failed but never *=* *retryMaxAttempts.* 
> {code:java}
> private Result processNamespace() throws IOException {
>   ... //wait for pending move to finish and retry the failed migration
>   if (hasFailed && !hasSuccess) {
> if (retryCount.get() == retryMaxAttempts) {
>   result.setRetryFailed();
>   LOG.error("Failed to move some block's after "
>   + retryMaxAttempts + " retries.");
>   return result;
> } else {
>   retryCount.incrementAndGet();
> }
>   } else {
> // Reset retry count if no failure.
> retryCount.set(0);
>   }
>   ...
> }
> {code}
> *How to fix*
> Add checking code for "dfs.mover.retry.max.attempts" so that it accepts only 
> non-negative values, or change the if statement condition to trigger when the 
> retry count exceeds the max attempts.






[jira] [Updated] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.

2020-07-04 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15439:

Description: 
Configuration parameter "dfs.mover.retry.max.attempts" is to define the maximum 
number of retries before the mover consider the move failed. There is no 
checking code so this parameter can accept any int value.

Theoratically, setting this value to <=0 should mean that no retry at all. 
However, if you set the value to negative value. The checking condition for 
retry failed will never satisfied because the if statement is "*if 
(retryCount.get() == retryMaxAttempts)*". The retry count will always +1 by 
retryCount.incrementAndGet() after failed but never *=* *retryMaxAttempts.* 
{code:java}
private Result processNamespace() throws IOException {
  ... //wait for pending move to finish and retry the failed migration
  if (hasFailed && !hasSuccess) {
if (retryCount.get() == retryMaxAttempts) {
  result.setRetryFailed();
  LOG.error("Failed to move some block's after "
  + retryMaxAttempts + " retries.");
  return result;
} else {
  retryCount.incrementAndGet();
}
  } else {
// Reset retry count if no failure.
retryCount.set(0);
  }
  ...
}
{code}
*How to fix*

Add checking code for "dfs.mover.retry.max.attempts" so that it accepts only 
non-negative values, or change the if statement condition to trigger when the 
retry count exceeds the max attempts.

  was:
Configuration parameter "dfs.mover.retry.max.attempts" is to define the maximum 
number of retries before the mover consider the move failed. There is no 
checking code so this parameter can accept any int value.

Theoratically, setting this value to <=0 should mean that no retry at all. 
However, if you set the value to negative value. The checking condition for 
retry failed will never satisfied because the if statement is "*if 
(retryCount.get() == retryMaxAttempts)*". The retry count will always +1 by 
retryCount.incrementAndGet() after failed but never *=* *retryMaxAttempts.* 
{code:java}
private Result processNamespace() throws IOException {
  ... //wait for pending move to finish and retry the failed migration
  if (hasFailed && !hasSuccess) {
if (retryCount.get() == retryMaxAttempts) { // retryMaxAttempts < 0
  result.setRetryFailed();
  LOG.error("Failed to move some block's after "
  + retryMaxAttempts + " retries.");
  return result;
} else {
  retryCount.incrementAndGet();
}
  } else {
// Reset retry count if no failure.
retryCount.set(0);
  }
  ...
}
{code}
*How to fix*

Add checking code for "dfs.mover.retry.max.attempts" so that it accepts only 
non-negative values, or change the if statement condition to trigger when the 
retry count exceeds the max attempts.


> Setting dfs.mover.retry.max.attempts to negative value will retry forever.
> --
>
> Key: HDFS-15439
> URL: https://issues.apache.org/jira/browse/HDFS-15439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
>
> Configuration parameter "dfs.mover.retry.max.attempts" is to define the 
> maximum number of retries before the mover consider the move failed. There is 
> no checking code so this parameter can accept any int value.
> Theoratically, setting this value to <=0 should mean that no retry at all. 
> However, if you set the value to negative value. The checking condition for 
> retry failed will never satisfied because the if statement is "*if 
> (retryCount.get() == retryMaxAttempts)*". The retry count will always +1 by 
> retryCount.incrementAndGet() after failed but never *=* *retryMaxAttempts.* 
> {code:java}
> private Result processNamespace() throws IOException {
>   ... //wait for pending move to finish and retry the failed migration
>   if (hasFailed && !hasSuccess) {
> if (retryCount.get() == retryMaxAttempts) {
>   result.setRetryFailed();
>   LOG.error("Failed to move some block's after "
>   + retryMaxAttempts + " retries.");
>   return result;
> } else {
>   retryCount.incrementAndGet();
> }
>   } else {
> // Reset retry count if no failure.
> retryCount.set(0);
>   }
>   ...
> }
> {code}
> *How to fix*
> Add checking code for "dfs.mover.retry.max.attempts" so that it accepts only 
> non-negative values, or change the if statement condition to trigger when the 
> retry count exceeds the max attempts.




[jira] [Updated] (HDFS-15438) Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy

2020-07-04 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15438:

Attachment: HDFS-15438.000.patch

> Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy
> --
>
> Key: HDFS-15438
> URL: https://issues.apache.org/jira/browse/HDFS-15438
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15438.000.patch
>
>
> In the HDFS disk balancer, the config parameter 
> "dfs.disk.balancer.max.disk.errors" controls the maximum number of errors we 
> can ignore for a specific move between two disks before it is abandoned.
> The parameter can accept any value >= 0, and setting the value to 0 should 
> mean no error tolerance. However, setting the value to 0 simply skips the 
> block copy even when no disk error occurs, because the while loop condition 
> *item.getErrorCount() < getMaxError(item)* is never satisfied.
> {code:java}
> // Gets the next block that we can copy
> private ExtendedBlock getBlockToCopy(FsVolumeSpi.BlockIterator iter,
>     DiskBalancerWorkItem item) {
>   while (!iter.atEnd() && item.getErrorCount() < getMaxError(item)) {
>     try {
>       ... // get the block
>     } catch (IOException e) {
>       item.incErrorCount();
>     }
>     if (item.getErrorCount() >= getMaxError(item)) {
>       item.setErrMsg("Error count exceeded.");
>       LOG.info("Maximum error count exceeded. Error count: {} Max error: {}",
>           item.getErrorCount(), item.getMaxDiskErrors());
>     }
> {code}
> *How to fix*
> Change the while loop condition to support value 0.
>   






[jira] [Updated] (HDFS-15438) Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy

2020-07-04 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15438:

Summary: Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block 
copy  (was: dfs.disk.balancer.max.disk.errors = 0 will fail the block copy)

> Setting dfs.disk.balancer.max.disk.errors = 0 will fail the block copy
> --
>
> Key: HDFS-15438
> URL: https://issues.apache.org/jira/browse/HDFS-15438
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
>
> In the HDFS disk balancer, the config parameter 
> "dfs.disk.balancer.max.disk.errors" controls the maximum number of errors we 
> can ignore for a specific move between two disks before it is abandoned.
> The parameter can accept any value >= 0, and setting the value to 0 should 
> mean no error tolerance. However, setting the value to 0 simply skips the 
> block copy even when no disk error occurs, because the while loop condition 
> *item.getErrorCount() < getMaxError(item)* is never satisfied.
> {code:java}
> // Gets the next block that we can copy
> private ExtendedBlock getBlockToCopy(FsVolumeSpi.BlockIterator iter,
>     DiskBalancerWorkItem item) {
>   while (!iter.atEnd() && item.getErrorCount() < getMaxError(item)) {
>     try {
>       ... // get the block
>     } catch (IOException e) {
>       item.incErrorCount();
>     }
>     if (item.getErrorCount() >= getMaxError(item)) {
>       item.setErrMsg("Error count exceeded.");
>       LOG.info("Maximum error count exceeded. Error count: {} Max error: {}",
>           item.getErrorCount(), item.getMaxDiskErrors());
>     }
> {code}
> *How to fix*
> Change the while loop condition to support value 0.
>   






[jira] [Commented] (HDFS-15443) Setting dfs.datanode.max.transfer.threads to a very small value can cause strange failure.

2020-07-04 Thread AMC-team (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151386#comment-17151386
 ] 

AMC-team commented on HDFS-15443:
-

[~jianghuazhu], [~elgoiri]

I added a simple patch to do the sanity check. I think it may help at least to 
prevent some accidental misconfigurations. 
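
For illustration only (not necessarily what the attached HDFS-15443.000.patch does), a self-contained sketch of such a sanity check that rejects non-positive values when the configuration is read; the helper name is made up for this example and 4096 is the usual default:
{code:java}
// Hypothetical validation helper, for illustration only.
final class MaxTransferThreadsCheck {
  static int checkedMaxTransferThreads(int configured) {
    if (configured <= 0) {
      throw new IllegalArgumentException(
          "dfs.datanode.max.transfer.threads must be positive, but was " + configured);
    }
    return configured;
  }

  public static void main(String[] args) {
    System.out.println(checkedMaxTransferThreads(4096)); // a sane value passes through
    try {
      checkedMaxTransferThreads(-1); // an accidental negative value is rejected up front
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}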

> Setting dfs.datanode.max.transfer.threads to a very small value can cause 
> strange failure.
> --
>
> Key: HDFS-15443
> URL: https://issues.apache.org/jira/browse/HDFS-15443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15443.000.patch
>
>
> Configuration parameter dfs.datanode.max.transfer.threads specifies the 
> maximum number of threads to use for transferring data in and out of the DN. 
> This is a vital param that needs to be tuned carefully. 
> {code:java}
> // DataXceiverServer.java
> // Make sure the xceiver count is not exceeded
> int curXceiverCount = datanode.getXceiverCount();
> if (curXceiverCount > maxXceiverCount) {
>   throw new IOException("Xceiver count " + curXceiverCount
>       + " exceeds the limit of concurrent xceivers: "
>       + maxXceiverCount);
> }
> {code}
> There are many issues caused by not setting this param to an appropriate 
> value. However, there is no check code to restrict the parameter. 
> Although having a hard-and-fast rule is difficult because we need to consider 
> the number of cores, main memory, etc., *we can prevent users from setting 
> this value to an absolutely wrong value by accident* (e.g. a negative value 
> that totally breaks the availability of the datanode).
> *How to fix:*
> Add proper check code for the parameter.






[jira] [Updated] (HDFS-15443) Setting dfs.datanode.max.transfer.threads to a very small value can cause strange failure.

2020-07-04 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15443:

Attachment: HDFS-15443.000.patch

> Setting dfs.datanode.max.transfer.threads to a very small value can cause 
> strange failure.
> --
>
> Key: HDFS-15443
> URL: https://issues.apache.org/jira/browse/HDFS-15443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15443.000.patch
>
>
> Configuration parameter dfs.datanode.max.transfer.threads specifies the 
> maximum number of threads to use for transferring data in and out of the DN. 
> This is a vital param that needs to be tuned carefully. 
> {code:java}
> // DataXceiverServer.java
> // Make sure the xceiver count is not exceeded
> int curXceiverCount = datanode.getXceiverCount();
> if (curXceiverCount > maxXceiverCount) {
>   throw new IOException("Xceiver count " + curXceiverCount
>       + " exceeds the limit of concurrent xceivers: "
>       + maxXceiverCount);
> }
> {code}
> There are many issues caused by not setting this param to an appropriate 
> value. However, there is no check code to restrict the parameter. 
> Although having a hard-and-fast rule is difficult because we need to consider 
> the number of cores, main memory, etc., *we can prevent users from setting 
> this value to an absolutely wrong value by accident* (e.g. a negative value 
> that totally breaks the availability of the datanode).
> *How to fix:*
> Add proper check code for the parameter.






[jira] [Updated] (HDFS-15443) Setting dfs.datanode.max.transfer.threads to a very small value can cause strange failure.

2020-07-04 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15443:

Attachment: (was: HDFS-15443.000.patch)

> Setting dfs.datanode.max.transfer.threads to a very small value can cause 
> strange failure.
> --
>
> Key: HDFS-15443
> URL: https://issues.apache.org/jira/browse/HDFS-15443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: AMC-team
>Priority: Major
>
> Configuration parameter dfs.datanode.max.transfer.threads specifies the 
> maximum number of threads to use for transferring data in and out of the DN. 
> This is a vital param that needs to be tuned carefully. 
> {code:java}
> // DataXceiverServer.java
> // Make sure the xceiver count is not exceeded
> int curXceiverCount = datanode.getXceiverCount();
> if (curXceiverCount > maxXceiverCount) {
>   throw new IOException("Xceiver count " + curXceiverCount
>       + " exceeds the limit of concurrent xceivers: "
>       + maxXceiverCount);
> }
> {code}
> There are many issues caused by not setting this param to an appropriate 
> value. However, there is no check code to restrict the parameter. 
> Although having a hard-and-fast rule is difficult because we need to consider 
> the number of cores, main memory, etc., *we can prevent users from setting 
> this value to an absolutely wrong value by accident* (e.g. a negative value 
> that totally breaks the availability of the datanode).
> *How to fix:*
> Add proper check code for the parameter.






[jira] [Updated] (HDFS-15443) Setting dfs.datanode.max.transfer.threads to a very small value can cause strange failure.

2020-07-04 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15443:

Attachment: HDFS-15443.000.patch

> Setting dfs.datanode.max.transfer.threads to a very small value can cause 
> strange failure.
> --
>
> Key: HDFS-15443
> URL: https://issues.apache.org/jira/browse/HDFS-15443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15443.000.patch
>
>
> Configuration parameter dfs.datanode.max.transfer.threads specifies the 
> maximum number of threads to use for transferring data in and out of the DN. 
> This is a vital param that needs to be tuned carefully. 
> {code:java}
> // DataXceiverServer.java
> // Make sure the xceiver count is not exceeded
> int curXceiverCount = datanode.getXceiverCount();
> if (curXceiverCount > maxXceiverCount) {
>   throw new IOException("Xceiver count " + curXceiverCount
>       + " exceeds the limit of concurrent xceivers: "
>       + maxXceiverCount);
> }
> {code}
> There are many issues caused by not setting this param to an appropriate 
> value. However, there is no check code to restrict the parameter. 
> Although having a hard-and-fast rule is difficult because we need to consider 
> the number of cores, main memory, etc., *we can prevent users from setting 
> this value to an absolutely wrong value by accident* (e.g. a negative value 
> that totally breaks the availability of the datanode).
> *How to fix:*
> Add proper check code for the parameter.






[jira] [Comment Edited] (HDFS-15452) Dynamically initialize the capacity of BlocksMap

2020-07-04 Thread jianghua zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151327#comment-17151327
 ] 

jianghua zhu edited comment on HDFS-15452 at 7/4/20, 1:25 PM:
--

We can use a configurable property in hdfs-default.xml to optimize this work 
item.
At the same time, we should also add a default constructor to the BlocksMap 
class.
[~elgoiri], do you have any other suggestions?


was (Author: jianghuazhu):
We can use a configurable property in hdfs-default.xml to optimize this work 
item.
[~elgoiri], do you have any other suggestions?

> Dynamically initialize the capacity of BlocksMap
> 
>
> Key: HDFS-15452
> URL: https://issues.apache.org/jira/browse/HDFS-15452
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.3
>Reporter: jianghua zhu
>Assignee: jianghua zhu
>Priority: Major
>
> The value used to initialize the capacity of BlocksMap in the BlockManager 
> class is hard-coded to 2 (i.e. 2% of total memory). This could be made 
> configurable instead.
> BlockManager#BlockManager() {
> ..
> // Compute the map capacity by allocating 2% of total memory
> blocksMap = new BlocksMap(
>  LightWeightGSet.computeCapacity(2.0, "BlocksMap"));
> ..
> }
>  
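
For illustration, a minimal sketch of the configurable-capacity idea discussed above, building on the BlockManager constructor excerpt and assuming the Configuration object (conf) available there; the property name and its 2.0 default are made up for this example, not an agreed-upon key:
{code:java}
// Sketch only: read the percentage from configuration instead of hard-coding 2.0.
// "dfs.namenode.blocksmap.capacity.percent" is a hypothetical property name.
double blocksMapCapacityPercent = conf.getDouble(
    "dfs.namenode.blocksmap.capacity.percent", 2.0);
blocksMap = new BlocksMap(
    LightWeightGSet.computeCapacity(blocksMapCapacityPercent, "BlocksMap"));
{code}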






[jira] [Commented] (HDFS-15452) Dynamically initialize the capacity of BlocksMap

2020-07-04 Thread jianghua zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151327#comment-17151327
 ] 

jianghua zhu commented on HDFS-15452:
-

We can use a configurable property in hdfs-default.xml to optimize this work 
item.
[~elgoiri], do you have any other suggestions?

> Dynamically initialize the capacity of BlocksMap
> 
>
> Key: HDFS-15452
> URL: https://issues.apache.org/jira/browse/HDFS-15452
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.3
>Reporter: jianghua zhu
>Assignee: jianghua zhu
>Priority: Major
>
> The value used to initialize the capacity of BlocksMap in the BlockManager 
> class is hard-coded to 2 (i.e. 2% of total memory). This could be made 
> configurable instead.
> BlockManager#BlockManager() {
> ..
> // Compute the map capacity by allocating 2% of total memory
> blocksMap = new BlocksMap(
>  LightWeightGSet.computeCapacity(2.0, "BlocksMap"));
> ..
> }
>  






[jira] [Assigned] (HDFS-15452) Dynamically initialize the capacity of BlocksMap

2020-07-04 Thread jianghua zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jianghua zhu reassigned HDFS-15452:
---

Assignee: jianghua zhu

> Dynamically initialize the capacity of BlocksMap
> 
>
> Key: HDFS-15452
> URL: https://issues.apache.org/jira/browse/HDFS-15452
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.3
>Reporter: jianghua zhu
>Assignee: jianghua zhu
>Priority: Major
>
> The value used to initialize the capacity of BlocksMap in the BlockManager 
> class is hard-coded to 2 (i.e. 2% of total memory). This could be made 
> configurable instead.
> BlockManager#BlockManager() {
> ..
> // Compute the map capacity by allocating 2% of total memory
> blocksMap = new BlocksMap(
>  LightWeightGSet.computeCapacity(2.0, "BlocksMap"));
> ..
> }
>  






[jira] [Updated] (HDFS-15452) Dynamically initialize the capacity of BlocksMap

2020-07-04 Thread jianghua zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jianghua zhu updated HDFS-15452:

Description: 
The value used to initialize the capacity of BlocksMap in the BlockManager 
class is hard-coded to 2 (i.e. 2% of total memory). This could be made 
configurable instead.

BlockManager#BlockManager() {

..

// Compute the map capacity by allocating 2% of total memory
blocksMap = new BlocksMap(
 LightWeightGSet.computeCapacity(2.0, "BlocksMap"));

..

}

 

  was:
The value used to initialize the capacity of BlocksMap in the BlockManager 
class is hard-coded to 2 (i.e. 2% of total memory). This could be made 
configurable instead.

 


> Dynamically initialize the capacity of BlocksMap
> 
>
> Key: HDFS-15452
> URL: https://issues.apache.org/jira/browse/HDFS-15452
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.3
>Reporter: jianghua zhu
>Priority: Major
>
> The value used to initialize the capacity of BlocksMap in the BlockManager 
> class is hard-coded to 2 (i.e. 2% of total memory). This could be made 
> configurable instead.
> BlockManager#BlockManager() {
> ..
> // Compute the map capacity by allocating 2% of total memory
> blocksMap = new BlocksMap(
>  LightWeightGSet.computeCapacity(2.0, "BlocksMap"));
> ..
> }
>  






[jira] [Updated] (HDFS-15452) Dynamically initialize the capacity of BlocksMap

2020-07-04 Thread jianghua zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jianghua zhu updated HDFS-15452:

Description: 
The value used to initialize the capacity of BlocksMap in the BlockManager 
class is hard-coded to 2 (i.e. 2% of total memory). This could be made 
configurable instead.

 

> Dynamically initialize the capacity of BlocksMap
> 
>
> Key: HDFS-15452
> URL: https://issues.apache.org/jira/browse/HDFS-15452
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.3
>Reporter: jianghua zhu
>Priority: Major
>
> The value used to initialize the capacity of BlocksMap in the BlockManager 
> class is hard-coded to 2 (i.e. 2% of total memory). This could be made 
> configurable instead.
>  






[jira] [Created] (HDFS-15452) Dynamically initialize the capacity of BlocksMap

2020-07-04 Thread jianghua zhu (Jira)
jianghua zhu created HDFS-15452:
---

 Summary: Dynamically initialize the capacity of BlocksMap
 Key: HDFS-15452
 URL: https://issues.apache.org/jira/browse/HDFS-15452
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.3
Reporter: jianghua zhu









[jira] [Commented] (HDFS-15446) CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path

2020-07-04 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151223#comment-17151223
 ] 

Hudson commented on HDFS-15446:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18406 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18406/])
HDFS-15446. CreateSnapshotOp fails during edit log loading for (ayushsaxena: 
rev f86f15cf2003a7c74d6a8dffa4c61236bc0a208a)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshot.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java


> CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with 
> error java.io.FileNotFoundException: Directory does not exist: 
> /.reserved/raw/path 
> ---
>
> Key: HDFS-15446
> URL: https://issues.apache.org/jira/browse/HDFS-15446
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Srinivasu Majeti
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: reserved-word, snapshot
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-15446.001.patch, HDFS-15446.002.patch, 
> HDFS-15446.003.patch
>
>
> After allowing snapshot creation for a path, say /app-logs, when we try to 
> create a snapshot on /.reserved/raw/app-logs, the snapshot creation succeeds, 
> but later when the Standby Namenode is restarted and tries to load the edit 
> record OP_CREATE_SNAPSHOT, we see it failing and the Standby Namenode shuts 
> down with the exception "java.io.FileNotFoundException: Directory does not 
> exist: /.reserved/raw/app-logs".
> Here are the steps to reproduce:
> {code:java}
> # hdfs dfs -ls /.reserved/raw/
> Found 15 items
> drwxrwxrwt   - yarn   hadoop  0 2020-06-29 10:27 
> /.reserved/raw/app-logs
> drwxr-xr-x   - hive   hadoop  0 2020-06-29 10:29 /.reserved/raw/prod
> ++
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /app-logs
> Allowing snapshot on /app-logs succeeded
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /prod
> Allowing snapshot on /prod succeeded
> ++
> # hdfs lsSnapshottableDir
> drwxrwxrwt 0 yarn hadoop 0 2020-06-29 10:27 1 65536 /app-logs
> drwxr-xr-x 0 hive hadoop 0 2020-06-29 10:29 1 65536 /prod
> ++
> [root@c3230-node2 ~]# hdfs dfs -createSnapshot /.reserved/raw/app-logs testSS
> Created snapshot /.reserved/raw/app-logs/.snapshot/testSS
> {code}
> Exception we see in Standby namenode while loading the snapshot creation edit 
> record.
> {code:java}
> 2020-06-29 10:33:25,488 ERROR namenode.NameNode (NameNode.java:main(1715)) - 
> Failed to start namenode.
> java.io.FileNotFoundException: Directory does not exist: 
> /.reserved/raw/app-logs
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.valueOf(INodeDirectory.java:60)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.getSnapshottableRoot(SnapshotManager.java:259)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.createSnapshot(SnapshotManager.java:307)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:772)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:257)
> {code}






[jira] [Updated] (HDFS-15446) CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path

2020-07-04 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15446:

Fix Version/s: 3.1.5
   3.4.0
   3.3.1
   3.2.2
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with 
> error java.io.FileNotFoundException: Directory does not exist: 
> /.reserved/raw/path 
> ---
>
> Key: HDFS-15446
> URL: https://issues.apache.org/jira/browse/HDFS-15446
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Srinivasu Majeti
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: reserved-word, snapshot
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-15446.001.patch, HDFS-15446.002.patch, 
> HDFS-15446.003.patch
>
>
> After allowing snapshot creation for a path, say /app-logs, when we try to 
> create a snapshot on /.reserved/raw/app-logs, the snapshot creation succeeds, 
> but later when the Standby Namenode is restarted and tries to load the edit 
> record OP_CREATE_SNAPSHOT, we see it failing and the Standby Namenode shuts 
> down with the exception "java.io.FileNotFoundException: Directory does not 
> exist: /.reserved/raw/app-logs".
> Here are the steps to reproduce:
> {code:java}
> # hdfs dfs -ls /.reserved/raw/
> Found 15 items
> drwxrwxrwt   - yarn   hadoop  0 2020-06-29 10:27 
> /.reserved/raw/app-logs
> drwxr-xr-x   - hive   hadoop  0 2020-06-29 10:29 /.reserved/raw/prod
> ++
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /app-logs
> Allowing snapshot on /app-logs succeeded
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /prod
> Allowing snapshot on /prod succeeded
> ++
> # hdfs lsSnapshottableDir
> drwxrwxrwt 0 yarn hadoop 0 2020-06-29 10:27 1 65536 /app-logs
> drwxr-xr-x 0 hive hadoop 0 2020-06-29 10:29 1 65536 /prod
> ++
> [root@c3230-node2 ~]# hdfs dfs -createSnapshot /.reserved/raw/app-logs testSS
> Created snapshot /.reserved/raw/app-logs/.snapshot/testSS
> {code}
> Exception we see in Standby namenode while loading the snapshot creation edit 
> record.
> {code:java}
> 2020-06-29 10:33:25,488 ERROR namenode.NameNode (NameNode.java:main(1715)) - 
> Failed to start namenode.
> java.io.FileNotFoundException: Directory does not exist: 
> /.reserved/raw/app-logs
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.valueOf(INodeDirectory.java:60)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.getSnapshottableRoot(SnapshotManager.java:259)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.createSnapshot(SnapshotManager.java:307)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:772)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:257)
> {code}






[jira] [Commented] (HDFS-15446) CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path

2020-07-04 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151215#comment-17151215
 ] 

Ayush Saxena commented on HDFS-15446:
-

Committed to trunk, branch-3.3, 3.2, 3.1
Thanx [~sodonnell] for the contribution, [~hemanthboyina] for the review and 
[~smajeti] for the report!!!

> CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with 
> error java.io.FileNotFoundException: Directory does not exist: 
> /.reserved/raw/path 
> ---
>
> Key: HDFS-15446
> URL: https://issues.apache.org/jira/browse/HDFS-15446
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Srinivasu Majeti
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: reserved-word, snapshot
> Attachments: HDFS-15446.001.patch, HDFS-15446.002.patch, 
> HDFS-15446.003.patch
>
>
> After allowing snapshot creation for a path, say /app-logs, when we try to 
> create a snapshot on /.reserved/raw/app-logs, the snapshot creation succeeds, 
> but later when the Standby Namenode is restarted and tries to load the edit 
> record OP_CREATE_SNAPSHOT, we see it failing and the Standby Namenode shuts 
> down with the exception "java.io.FileNotFoundException: Directory does not 
> exist: /.reserved/raw/app-logs".
> Here are the steps to reproduce:
> {code:java}
> # hdfs dfs -ls /.reserved/raw/
> Found 15 items
> drwxrwxrwt   - yarn   hadoop  0 2020-06-29 10:27 
> /.reserved/raw/app-logs
> drwxr-xr-x   - hive   hadoop  0 2020-06-29 10:29 /.reserved/raw/prod
> ++
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /app-logs
> Allowing snapshot on /app-logs succeeded
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /prod
> Allowing snapshot on /prod succeeded
> ++
> # hdfs lsSnapshottableDir
> drwxrwxrwt 0 yarn hadoop 0 2020-06-29 10:27 1 65536 /app-logs
> drwxr-xr-x 0 hive hadoop 0 2020-06-29 10:29 1 65536 /prod
> ++
> [root@c3230-node2 ~]# hdfs dfs -createSnapshot /.reserved/raw/app-logs testSS
> Created snapshot /.reserved/raw/app-logs/.snapshot/testSS
> {code}
> Exception we see in Standby namenode while loading the snapshot creation edit 
> record.
> {code:java}
> 2020-06-29 10:33:25,488 ERROR namenode.NameNode (NameNode.java:main(1715)) - 
> Failed to start namenode.
> java.io.FileNotFoundException: Directory does not exist: 
> /.reserved/raw/app-logs
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.valueOf(INodeDirectory.java:60)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.getSnapshottableRoot(SnapshotManager.java:259)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.createSnapshot(SnapshotManager.java:307)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:772)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:257)
> {code}






[jira] [Commented] (HDFS-15430) create should work when parent dir is internalDir and fallback configured.

2020-07-04 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151210#comment-17151210
 ] 

Hudson commented on HDFS-15430:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18405 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18405/])
HDFS-15430. create should work when parent dir is internalDir and (github: rev 
1f2a80b5e5024aeb7fb1f8c31b8fdd0fdb88bb66)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFs.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFsLinkFallback.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemOverloadSchemeWithHdfsScheme.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemLinkFallback.java


> create should work when parent dir is internalDir and fallback configured.
> ---
>
> Key: HDFS-15430
> URL: https://issues.apache.org/jira/browse/HDFS-15430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.2.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Fix For: 3.4.0
>
>
> create will not work if the parent dir is an internal mount dir (a non-leaf 
> in the mount path) and a fallback is configured.
> Since a fallback is available, and if the same tree structure is available in 
> the fallback, we should be able to create in the fallback fs.






[jira] [Updated] (HDFS-15430) create should work when parent dir is internalDir and fallback configured.

2020-07-04 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-15430:
---
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~ayushtkn] for reviews!!!
I have just committed this to trunk.

> create should work when parent dir is internalDir and fallback configured.
> ---
>
> Key: HDFS-15430
> URL: https://issues.apache.org/jira/browse/HDFS-15430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.2.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Fix For: 3.4.0
>
>
> create will not work if the parent dir is an internal mount dir (a non-leaf 
> in the mount path) and a fallback is configured.
> Since a fallback is available, and if the same tree structure is available in 
> the fallback, we should be able to create in the fallback fs.






[jira] [Commented] (HDFS-15446) CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path

2020-07-04 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151201#comment-17151201
 ] 

Ayush Saxena commented on HDFS-15446:
-

v003 LGTM +1

> CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with 
> error java.io.FileNotFoundException: Directory does not exist: 
> /.reserved/raw/path 
> ---
>
> Key: HDFS-15446
> URL: https://issues.apache.org/jira/browse/HDFS-15446
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Srinivasu Majeti
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: reserved-word, snapshot
> Attachments: HDFS-15446.001.patch, HDFS-15446.002.patch, 
> HDFS-15446.003.patch
>
>
> After allowing snapshot creation for a path, say /app-logs, when we try to 
> create a snapshot on /.reserved/raw/app-logs, the snapshot creation succeeds, 
> but later when the Standby Namenode is restarted and tries to load the edit 
> record OP_CREATE_SNAPSHOT, we see it failing and the Standby Namenode shuts 
> down with the exception "java.io.FileNotFoundException: Directory does not 
> exist: /.reserved/raw/app-logs".
> Here are the steps to reproduce:
> {code:java}
> # hdfs dfs -ls /.reserved/raw/
> Found 15 items
> drwxrwxrwt   - yarn   hadoop  0 2020-06-29 10:27 
> /.reserved/raw/app-logs
> drwxr-xr-x   - hive   hadoop  0 2020-06-29 10:29 /.reserved/raw/prod
> ++
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /app-logs
> Allowing snapshot on /app-logs succeeded
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /prod
> Allowing snapshot on /prod succeeded
> ++
> # hdfs lsSnapshottableDir
> drwxrwxrwt 0 yarn hadoop 0 2020-06-29 10:27 1 65536 /app-logs
> drwxr-xr-x 0 hive hadoop 0 2020-06-29 10:29 1 65536 /prod
> ++
> [root@c3230-node2 ~]# hdfs dfs -createSnapshot /.reserved/raw/app-logs testSS
> Created snapshot /.reserved/raw/app-logs/.snapshot/testSS
> {code}
> Exception we see in Standby namenode while loading the snapshot creation edit 
> record.
> {code:java}
> 2020-06-29 10:33:25,488 ERROR namenode.NameNode (NameNode.java:main(1715)) - 
> Failed to start namenode.
> java.io.FileNotFoundException: Directory does not exist: 
> /.reserved/raw/app-logs
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.valueOf(INodeDirectory.java:60)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.getSnapshottableRoot(SnapshotManager.java:259)
> at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.createSnapshot(SnapshotManager.java:307)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:772)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:257)
> {code}


