[jira] [Commented] (HDFS-15404) ShellCommandFencer should expose info about source

2020-06-18 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140176#comment-17140176
 ] 

Hadoop QA commented on HDFS-15404:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
27m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
8s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
45s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 
16s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 52s{color} | {color:orange} root: The patch generated 4 new + 23 unchanged - 
0 fixed = 27 total (was 23) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m 22s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 35s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
54s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}268m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ha.TestFailoverController |
|   | hadoop.ha.TestShellCommandFencer |
|   | hadoop.ha.TestNodeFencer |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
|   | hadoop.hdfs.tools.TestDFSHAAdminMiniCluster |
|   | 

[jira] [Commented] (HDFS-14941) Potential editlog race condition can cause corrupted file

2020-06-18 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140156#comment-17140156
 ] 

Kihwal Lee commented on HDFS-14941:
---

This change causes an incremental block report (IBR) leak. The last set of IBRs 
from an append is always considered to be from the future and is re-queued. Unless 
the file is appended again, those reports won't get processed; and even then, the 
IBRs for the latest append will leak.  If this happens during a startup safe mode, 
the standby NN can never leave the safe mode on its own.  I didn't test with 
truncate, but it might happen with truncate too.

It is easy to see the last set of IBRs getting re-queued after enabling debug 
logging in BlockManager.   We first thought there was something wrong with the 
new safe mode implementation, but found that the baseline number of datanode 
pending messages was growing, which does not happen in 2.8. After enabling debug 
logging, we could see the IBRs for the append getting re-queued rather than 
processed.  Reverting this change fixed the issue.

Regarding the original corruption issue you saw, we have seen something very 
similar too. After a failover, the NameNode suddenly reported missing blocks due 
to corruption, but the corruption reason was recorded as "size mismatch" in our 
case. Of course, the actual data was fine. We haven't seen it happen again 
after the fix, but it is rare anyway.  The main part of the fix we did is:
{code}
@@ -2578,10 +2578,7 @@ private BlockInfo processReportedBlock(
 // If the block is an out-of-date generation stamp or state,
 // but we're the standby, we shouldn't treat it as corrupt,
 // but instead just queue it for later processing.
-// TODO: Pretty confident this should be s/storedBlock/block below,
-// since we should be postponing the info of the reported block, not
-// the stored block. See HDFS-6289 for more context.
-queueReportedBlock(storageInfo, storedBlock, reportedState,
+queueReportedBlock(storageInfo, block, reportedState,
 QUEUE_REASON_CORRUPT_STATE);
   } else {
 toCorrupt.add(c);
{code}

We wanted to get more run time before reporting to the community. This is the only 
place where a wrong size is queued with an IBR on append or truncate, because it 
queues the stored block, not the reported one.  I wonder why it was left like 
that all these years, despite the suspicious comment.
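
For readers less familiar with the IBR path, a tiny self-contained sketch of the stored-vs-reported distinction (plain Java; the class and values are made up and this is not BlockManager code): the stored copy carries the NameNode's pre-append size/genstamp, while the reported replica carries what the DataNode actually has, which is why the reported block is the one that must be queued.

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model only -- not Hadoop code. Illustrates why queueing the NameNode's
// stored view instead of the DataNode's reported replica drops the IBR info.
public class QueuedIbrSketch {
  static final class BlockView {
    final long genStamp;
    final long numBytes;
    BlockView(long genStamp, long numBytes) {
      this.genStamp = genStamp;
      this.numBytes = numBytes;
    }
  }

  public static void main(String[] args) {
    BlockView stored   = new BlockView(5, 1024);  // NameNode's pre-append state
    BlockView reported = new BlockView(6, 4096);  // what the DataNode has now

    Queue<BlockView> pendingOnStandby = new ArrayDeque<>();
    // Queueing 'stored' here (the pre-fix behavior in the diff above) would
    // replay only the stale NameNode state and lose the reported size/genstamp.
    pendingOnStandby.add(reported);

    BlockView replayed = pendingOnStandby.remove();
    System.out.println("replayed genStamp=" + replayed.genStamp
        + " numBytes=" + replayed.numBytes);
  }
}
{code}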


> Potential editlog race condition can cause corrupted file
> -
>
> Key: HDFS-14941
> URL: https://issues.apache.org/jira/browse/HDFS-14941
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
>  Labels: ha
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-14941.001.patch, HDFS-14941.002.patch, 
> HDFS-14941.003.patch, HDFS-14941.004.patch, HDFS-14941.005.patch, 
> HDFS-14941.006.patch
>
>
> Recently we encountered an issue that, after a failover, NameNode complains 
> corrupted file/missing blocks. The blocks did recover after full block 
> reports, so the blocks are not actually missing. After further investigation, 
> we believe this is what happened:
> First of all, on SbN, it is possible that it receives block reports before 
> corresponding edit tailing happened. In which case SbN postpones processing 
> the DN block report, handled by the guarding logic below:
> {code:java}
>   if (shouldPostponeBlocksFromFuture &&
>   namesystem.isGenStampInFuture(iblk)) {
> queueReportedBlock(storageInfo, iblk, reportedState,
> QUEUE_REASON_FUTURE_GENSTAMP);
> continue;
>   }
> {code}
> Basically if reported block has a future generation stamp, the DN report gets 
> requeued.
> However, in {{FSNamesystem#storeAllocatedBlock}}, we have the following code:
> {code:java}
>   // allocate new block, record block locations in INode.
>   newBlock = createNewBlock();
>   INodesInPath inodesInPath = INodesInPath.fromINode(pendingFile);
>   saveAllocatedBlock(src, inodesInPath, newBlock, targets);
>   persistNewBlock(src, pendingFile);
>   offset = pendingFile.computeFileSize();
> {code}
> The line
>  {{newBlock = createNewBlock();}}
>  Would log an edit entry {{OP_SET_GENSTAMP_V2}} to bump generation stamp on 
> Standby
>  while the following line
>  {{persistNewBlock(src, pendingFile);}}
>  would log another edit entry {{OP_ADD_BLOCK}} to actually add the block on 
> Standby.
> Then the race condition is that, imagine Standby has just processed 
> {{OP_SET_GENSTAMP_V2}}, but not yet {{OP_ADD_BLOCK}} (if they just happen to 
> be in different setment). Now a block report with new generation stamp comes 
> in.
> Since the genstamp bump has already been 

[jira] [Updated] (HDFS-15416) DataStorage#addStorageLocations() should add more reasonable information verification.

2020-06-18 Thread jianghua zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jianghua zhu updated HDFS-15416:

Summary: DataStorage#addStorageLocations() should add more reasonable 
information verification.  (was: The addStorageLocations() method in the 
DataStorage class is not perfect.)

> DataStorage#addStorageLocations() should add more reasonable information 
> verification.
> --
>
> Key: HDFS-15416
> URL: https://issues.apache.org/jira/browse/HDFS-15416
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.1.1
>Reporter: jianghua zhu
>Assignee: jianghua zhu
>Priority: Major
> Attachments: HDFS-15416.patch
>
>
> successLocations is a list; when it is empty, there is no need to call 
> loadBlockPoolSliceStorage() again.
> Code:
> try {
>   final List<StorageLocation> successLocations = loadDataStorage(
>       datanode, nsInfo, dataDirs, startOpt, executor);
>   return loadBlockPoolSliceStorage(
>       datanode, nsInfo, successLocations, startOpt, executor);
> } finally {
>   executor.shutdown();
> }
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15404) ShellCommandFencer should expose info about source

2020-06-18 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140052#comment-17140052
 ] 

Chen Liang commented on HDFS-15404:
---

Uploaded the v002 patch to fix the bug that caused the failed tests. The bug is 
that parseArgs should allow the cmd to contain only the command, in which case 
both src and dst will execute the same command/script.
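
For illustration only, a sketch of what "cmd only having command" could mean in parsing terms (this is not the HDFS-15404 patch; the separator, method, and class below are invented): with a single command, the fencing source and the fencing target run the same script, while a two-part value would select per-side scripts.

{code:java}
// Hypothetical sketch, not the actual ShellCommandFencer change.
public class FencerArgsSketch {
  /** Returns {srcCommand, dstCommand}; a lone command is shared by both. */
  static String[] parseArgs(String cmd) {
    String[] parts = cmd.split("\\|\\|", 2);   // invented separator
    if (parts.length == 1) {
      String shared = parts[0].trim();
      return new String[] { shared, shared };
    }
    return new String[] { parts[0].trim(), parts[1].trim() };
  }

  public static void main(String[] args) {
    String[] same  = parseArgs("/scripts/fence.sh");
    String[] split = parseArgs("/scripts/fence-src.sh || /scripts/fence-dst.sh");
    System.out.println(same[0] + " / " + same[1]);
    System.out.println(split[0] + " / " + split[1]);
  }
}
{code}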

> ShellCommandFencer should expose info about source
> --
>
> Key: HDFS-15404
> URL: https://issues.apache.org/jira/browse/HDFS-15404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-15404.001.patch, HDFS-15404.002.patch
>
>
> Currently the HA fencing logic in ShellCommandFencer exposes environment 
> variables about only the fencing target, i.e. the $target_* variables 
> mentioned in this [document 
> page|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html].
>  
> Sometimes it is useful to also expose info about the fencing source node. One 
> use case: it would allow the source and target nodes to identify themselves 
> separately and run different commands/scripts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15404) ShellCommandFencer should expose info about source

2020-06-18 Thread Chen Liang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-15404:
--
Attachment: HDFS-15404.002.patch

> ShellCommandFencer should expose info about source
> --
>
> Key: HDFS-15404
> URL: https://issues.apache.org/jira/browse/HDFS-15404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-15404.001.patch, HDFS-15404.002.patch
>
>
> Currently the HA fencing logic in ShellCommandFencer exposes environment 
> variables about only the fencing target, i.e. the $target_* variables 
> mentioned in this [document 
> page|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html].
>  
> Sometimes it is useful to also expose info about the fencing source node. One 
> use case: it would allow the source and target nodes to identify themselves 
> separately and run different commands/scripts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15378) TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on trunk

2020-06-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139990#comment-17139990
 ] 

Íñigo Goiri commented on HDFS-15378:


+1 on  [^HDFS-15378.001.patch].

> TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on 
> trunk
> -
>
> Key: HDFS-15378
> URL: https://issues.apache.org/jira/browse/HDFS-15378
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Priority: Major
> Attachments: HDFS-15378.001.patch
>
>
> [https://builds.apache.org/job/PreCommit-HDFS-Build/29377/#showFailuresLink]
> [https://builds.apache.org/job/PreCommit-HDFS-Build/29368/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14546) Document block placement policies

2020-06-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139991#comment-17139991
 ] 

Íñigo Goiri commented on HDFS-14546:


+1 on  [^HDFS-14546-09.patch].

> Document block placement policies
> -
>
> Key: HDFS-14546
> URL: https://issues.apache.org/jira/browse/HDFS-14546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Amithsha
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, 
> HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, 
> HDFS-14546-06.patch, HDFS-14546-07.patch, HDFS-14546-08.patch, 
> HDFS-14546-09.patch, HdfsDesign.patch
>
>
> Currently, all the documentation refers to the default block placement policy.
> However, over time there have been new policies:
> * BlockPlacementPolicyRackFaultTolerant (HDFS-7891)
> * BlockPlacementPolicyWithNodeGroup (HDFS-3601)
> * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006)
> We should update the documentation to refer to them, explaining their 
> particularities and probably how to set up each one of them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15416) The addStorageLocations() method in the DataStorage class is not perfect.

2020-06-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139988#comment-17139988
 ] 

Íñigo Goiri commented on HDFS-15416:


Please update the title to be a little more specific.

> The addStorageLocations() method in the DataStorage class is not perfect.
> -
>
> Key: HDFS-15416
> URL: https://issues.apache.org/jira/browse/HDFS-15416
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.1.1
>Reporter: jianghua zhu
>Assignee: jianghua zhu
>Priority: Major
> Attachments: HDFS-15416.patch
>
>
> successLocations is a list; when it is empty, there is no need to call 
> loadBlockPoolSliceStorage() again.
> Code:
> try {
>   final List<StorageLocation> successLocations = loadDataStorage(
>       datanode, nsInfo, dataDirs, startOpt, executor);
>   return loadBlockPoolSliceStorage(
>       datanode, nsInfo, successLocations, startOpt, executor);
> } finally {
>   executor.shutdown();
> }
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15418) ViewFileSystemOverloadScheme should represent mount links as non symlinks

2020-06-18 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-15418:
---
Status: Patch Available  (was: Open)

> ViewFileSystemOverloadScheme should represent mount links as non symlinks
> -
>
> Key: HDFS-15418
> URL: https://issues.apache.org/jira/browse/HDFS-15418
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>
> Currently ViewFileSystemOverloadScheme uses the ViewFileSystem default 
> behavior: ViewFS always represents mount links as symlinks. Since 
> ViewFSOverloadScheme can be used with any scheme, and that scheme's fs may 
> not support symlinks, the ViewFs symlink behavior can be confusing.
> So, here I propose to represent mount links as non-symlinks in 
> ViewFSOverloadScheme.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15418) ViewFileSystemOverloadScheme should represent mount links as non symlinks

2020-06-18 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139983#comment-17139983
 ] 

Uma Maheswara Rao G edited comment on HDFS-15418 at 6/18/20, 8:46 PM:
--

Updated PR for review!

By default ViewFileSystem represents mount links as symlinks. Many deployments 
have not really developed symlink-aware applications, since ListStatus behaves a 
little differently when symlinks are present. With OverloadScheme, if an 
application was built on another fs with no symlink handling, the application 
sees some different behaviors. Ex: HADOOP-17024.
 However, HADOOP-17029 attempted to fix some of those incompatibilities, but 
changing existing behavior would create incompatibility issues. So, the idea is 
to introduce an advanced config to disable the symlink assumption in 
ViewFileSystem#listStatus. By default, mount links are still treated as symlinks 
in ViewFileSystem; to disable that, set fs.viewfs.mount.links.as.symlinks to 
false.

In ViewFileSystemOverloadScheme, it is false by default, as we tend to behave 
like any other HCFS filesystem and many of them might not have symlinks. If one 
wants to see the same behavior as in ViewFileSystem, set 
fs.viewfs.mount.links.as.symlinks to true. This is an advanced, non-advertised 
property.

 CC: [~abhishekd] please check whether this works fine in your scenarios, as this 
is slightly modified behavior from HADOOP-17029.


was (Author: umamaheswararao):
Updated PR for review!

By default ViewFileSystem represents mount links as symlinks. Many deployments 
have not really developed symlink-aware applications, since ListStatus behaves a 
little differently when symlinks are present. With OverloadScheme, if an 
application was built on another fs with no symlink handling, the application 
sees some different behaviors. Ex: HADOOP-17024.
However, HADOOP-17029 attempted to fix some of those incompatibilities, but 
changing existing behavior would create incompatibility issues. So, the idea is 
to introduce an advanced config to disable the symlink assumption in 
ViewFileSystem#listStatus. By default, mount links are still treated as symlinks 
in ViewFileSystem; to disable that, set fs.viewfs.mount.links.as.symlinks to 
false.

In ViewFileSystemOverloadScheme, it is false by default, as we tend to behave 
like any other HCFS filesystem and many of them might not have symlinks. If one 
wants to see the same behavior as in ViewFileSystem, set 
fs.viewfs.mount.links.as.symlinks to true. This is an advanced, non-advertised 
property.

 

> ViewFileSystemOverloadScheme should represent mount links as non symlinks
> -
>
> Key: HDFS-15418
> URL: https://issues.apache.org/jira/browse/HDFS-15418
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>
> Currently ViewFileSystemOverloadScheme uses the ViewFileSystem default 
> behavior: ViewFS always represents mount links as symlinks. Since 
> ViewFSOverloadScheme can be used with any scheme, and that scheme's fs may 
> not support symlinks, the ViewFs symlink behavior can be confusing.
> So, here I propose to represent mount links as non-symlinks in 
> ViewFSOverloadScheme.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15418) ViewFileSystemOverloadScheme should represent mount links as non symlinks

2020-06-18 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139983#comment-17139983
 ] 

Uma Maheswara Rao G commented on HDFS-15418:


Updated PR for review!

By default ViewFileSystem represents mount links as symlinks. Many deployments 
have not really developed symlink-aware applications, since ListStatus behaves a 
little differently when symlinks are present. With OverloadScheme, if an 
application was built on another fs with no symlink handling, the application 
sees some different behaviors. Ex: HADOOP-17024.
However, HADOOP-17029 attempted to fix some of those incompatibilities, but 
changing existing behavior would create incompatibility issues. So, the idea is 
to introduce an advanced config to disable the symlink assumption in 
ViewFileSystem#listStatus. By default, mount links are still treated as symlinks 
in ViewFileSystem; to disable that, set fs.viewfs.mount.links.as.symlinks to 
false.

In ViewFileSystemOverloadScheme, it is false by default, as we tend to behave 
like any other HCFS filesystem and many of them might not have symlinks. If one 
wants to see the same behavior as in ViewFileSystem, set 
fs.viewfs.mount.links.as.symlinks to true. This is an advanced, non-advertised 
property.
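
For reference, a minimal sketch of flipping the property from client code; the key name is taken from the comment above, while the assumption that it is read when the ViewFileSystemOverloadScheme instance is initialized is mine.

{code:java}
import org.apache.hadoop.conf.Configuration;

public class MountLinkSymlinkToggle {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Ask for mount links to be listed as non-symlinks (assumed to be read at
    // filesystem initialization time).
    conf.setBoolean("fs.viewfs.mount.links.as.symlinks", false);
    System.out.println("fs.viewfs.mount.links.as.symlinks = "
        + conf.getBoolean("fs.viewfs.mount.links.as.symlinks", true));
  }
}
{code}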

 

> ViewFileSystemOverloadScheme should represent mount links as non symlinks
> -
>
> Key: HDFS-15418
> URL: https://issues.apache.org/jira/browse/HDFS-15418
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>
> Currently ViewFileSystemOverloadScheme uses the ViewFileSystem default 
> behavior: ViewFS always represents mount links as symlinks. Since 
> ViewFSOverloadScheme can be used with any scheme, and that scheme's fs may 
> not support symlinks, the ViewFs symlink behavior can be confusing.
> So, here I propose to represent mount links as non-symlinks in 
> ViewFSOverloadScheme.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15419:
---
Summary: RBF: Router should retry communicate with NN when cluster is 
unavailable using configurable time interval  (was: Router should retry 
communicate with NN when cluster is unavailable using configurable time 
interval)

> RBF: Router should retry communicate with NN when cluster is unavailable 
> using configurable time interval
> -
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: configuration, hdfs-client, rbf
>Reporter: bhji123
>Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only 
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several HDFS clusters with more than 
> 1000 nodes, we have encountered this problem. In some cases a cluster becomes 
> unavailable briefly, for about 10 or 30 seconds; during that time almost all 
> RPC requests to the router fail, because the router only retries once without 
> any interval.
> It would be better to enhance the router retry strategy so that it retries 
> communication with the NN using a configurable time interval and maximum 
> retry count.
>  
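
As a rough sketch of the kind of retry policy being asked for, built on Hadoop's existing org.apache.hadoop.io.retry API (the config key names below are purely illustrative, not existing RBF keys, and this is not a proposed patch):

{code:java}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;

public class RouterRetrySketch {
  public static RetryPolicy buildPolicy(Configuration conf) {
    // Illustrative key names only.
    int maxRetries = conf.getInt("dfs.federation.router.connect.max.retries", 3);
    long intervalMs = conf.getTimeDuration(
        "dfs.federation.router.connect.retry.interval",
        500, TimeUnit.MILLISECONDS);
    // Retry a bounded number of times, sleeping a fixed interval between tries.
    return RetryPolicies.retryUpToMaximumCountWithFixedSleep(
        maxRetries, intervalMs, TimeUnit.MILLISECONDS);
  }
}
{code}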



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool

2020-06-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139961#comment-17139961
 ] 

Íñigo Goiri commented on HDFS-15410:


I think we should refer to this in the documentation too.

> Add separated config file fedbalance-default.xml for fedbalance tool
> 
>
> Key: HDFS-15410
> URL: https://issues.apache.org/jira/browse/HDFS-15410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15410.001.patch
>
>
> Add a separate config file named fedbalance-default.xml for fedbalance tool 
> configs. It's like the distcp-default.xml for the distcp tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15417) RBF: Lazy get the datanode report for federation WebHDFS operations

2020-06-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15417:
---
Summary: RBF: Lazy get the datanode report for federation WebHDFS 
operations  (was: Lazy get the datanode report for federation WebHDFS 
operations)

> RBF: Lazy get the datanode report for federation WebHDFS operations
> ---
>
> Key: HDFS-15417
> URL: https://issues.apache.org/jira/browse/HDFS-15417
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation, rbf, webhdfs
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>
> *Why*
>  For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or 
> namenode needs to get the datanodes where the block is located, then redirect 
> the request to one of the datanodes.
> However, this chooseDatanode action in router is much slower than namenode, 
> which directly affects the WebHDFS operations above.
> For namenode WebHDFS, it normally takes tens of milliseconds, while router 
> always takes more than 2 seconds.
> *How*
>  Only get the datanode report when necessary in the router. It is a very 
> expensive operation, and it is where all the time is spent.
> This is only needed when we want to exclude some datanodes or find a random 
> datanode for CREATE.
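
To illustrate the "lazy get" idea generically (this is not the Router code, just a memoized supplier sketch): the expensive call is made only the first time the value is actually needed, so operations that never need the report never pay for it.

{code:java}
import java.util.function.Supplier;

/** Generic lazy/memoized value; not the Router implementation. */
public class LazySketch<T> {
  private final Supplier<T> loader;
  private volatile T value;

  public LazySketch(Supplier<T> loader) {
    this.loader = loader;
  }

  public T get() {
    T v = value;
    if (v == null) {
      synchronized (this) {
        v = value;
        if (v == null) {
          value = v = loader.get();   // the expensive call happens here, once
        }
      }
    }
    return v;
  }

  public static void main(String[] args) {
    LazySketch<String> dnReport =
        new LazySketch<>(() -> "expensive datanode report");
    // A request that never needs to exclude datanodes would simply never
    // call get(), so the report is never fetched.
    System.out.println(dnReport.get());
  }
}
{code}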



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15417) Lazy get the datanode report for federation WebHDFS operations

2020-06-18 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139554#comment-17139554
 ] 

Chao Sun commented on HDFS-15417:
-

I think this addresses the same issue in HDFS-15014. Internally we were trying 
to use the cached DN reports but those are tied with Router metrics and the 
implementation is kind of messy.

> Lazy get the datanode report for federation WebHDFS operations
> --
>
> Key: HDFS-15417
> URL: https://issues.apache.org/jira/browse/HDFS-15417
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation, rbf, webhdfs
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>
> *Why*
>  For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or 
> namenode needs to get the datanodes where the block is located, then redirect 
> the request to one of the datanodes.
> However, this chooseDatanode action in router is much slower than namenode, 
> which directly affects the WebHDFS operations above.
> For namenode WebHDFS, it normally takes tens of milliseconds, while router 
> always takes more than 2 seconds.
> *How*
>  Only get the datanode report when necessary in the router. It is a very 
> expensive operation, and it is where all the time is spent.
> This is only needed when we want to exclude some datanodes or find a random 
> datanode for CREATE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13965) hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS encryption is enabled.

2020-06-18 Thread LOKESKUMAR VIJAYAKUMAR (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139552#comment-17139552
 ] 

LOKESKUMAR VIJAYAKUMAR commented on HDFS-13965:
---

Hello Team!
Can anyone please help here?

> hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS 
> encryption is enabled.
> -
>
> Key: HDFS-13965
> URL: https://issues.apache.org/jira/browse/HDFS-13965
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, kms
>Affects Versions: 2.7.3, 2.7.7
>Reporter: LOKESKUMAR VIJAYAKUMAR
>Assignee: Kitti Nanasi
>Priority: Major
>
> _We use the *+hadoop.security.kerberos.ticket.cache.path+* setting to provide 
> a custom kerberos cache path for all hadoop operations to be run as specified 
> user. But this setting is not honored when KMS encryption is enabled._
> _The below program to read a file works when KMS encryption is not enabled, 
> but it fails when the KMS encryption is enabled._
> _Looks like *hadoop.security.kerberos.ticket.cache.path* setting is not 
> honored by *createConnection on KMSClientProvider.java.*_
>  
> HadoopTest.java (CLASSPATH needs to be set to compile and run)
>  
> import java.io.InputStream;
> import java.net.URI;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>  
> public class HadoopTest {
>     public static int runRead(String[] args) throws Exception {
>         if (args.length < 3) {
>             System.err.println("HadoopTest hadoop_file_path hadoop_user kerberos_cache");
>             return 1;
>         }
>         Path inputPath = new Path(args[0]);
>         Configuration conf = new Configuration();
>         URI defaultURI = FileSystem.getDefaultUri(conf);
>         conf.set("hadoop.security.kerberos.ticket.cache.path", args[2]);
>         FileSystem fs = FileSystem.newInstance(defaultURI, conf, args[1]);
>         InputStream is = fs.open(inputPath);
>         byte[] buffer = new byte[4096];
>         int nr = is.read(buffer);
>         while (nr != -1) {
>             System.out.write(buffer, 0, nr);
>             nr = is.read(buffer);
>         }
>         return 0;
>     }
>
>     public static void main(String[] args) throws Exception {
>         int returnCode = HadoopTest.runRead(args);
>         System.exit(returnCode);
>     }
> }
>  
>  
>  
> [root@lstrost3 testhadoop]# pwd
> /testhadoop
>  
> [root@lstrost3 testhadoop]# ls
> HadoopTest.java
>  
> [root@lstrost3 testhadoop]# export CLASSPATH=`hadoop classpath --glob`:.
>  
> [root@lstrost3 testhadoop]# javac HadoopTest.java
>  
> [root@lstrost3 testhadoop]# java HadoopTest
> HadoopTest  hadoop_file_path  hadoop_user  kerberos_cache
>  
> [root@lstrost3 testhadoop]# java HadoopTest /loki/loki.file loki 
> /tmp/krb5cc_1006
> 18/09/27 23:23:20 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/09/27 23:23:21 WARN shortcircuit.DomainSocketFactory: The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> Exception in thread "main" java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: *{color:#FF}No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt){color}*
>     at 
> {color:#FF}*org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:551)*{color}
>     at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:831)
>     at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
>     at 
> org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1393)
>     at 
> org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1463)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:333)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
>     at HadoopTest.runRead(HadoopTest.java:18)
>     at HadoopTest.main(HadoopTest.java:29)
> Caused 

[jira] [Commented] (HDFS-15420) approx scheduled blocks not resetting over time

2020-06-18 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139468#comment-17139468
 ] 

hemanthboyina commented on HDFS-15420:
--

Thanks [~maxmzkr] for providing the report. A quick question: are there any 
pending reconstruction requests that have timed out?

> approx scheduled blocks not resetting over time
> --
>
> Key: HDFS-15420
> URL: https://issues.apache.org/jira/browse/HDFS-15420
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: block placement
>Affects Versions: 2.6.0, 3.0.0
> Environment: Our 2.6.0 environment is a 3 node cluster running 
> cdh5.15.0.
> Our 3.0.0 environment is a 4 node cluster running cdh6.3.0.
>Reporter: Max Mizikar
>Priority: Minor
> Attachments: Screenshot from 2020-06-18 09-29-57.png, Screenshot from 
> 2020-06-18 09-31-15.png
>
>
> We have been experiencing large amounts of scheduled blocks that never get 
> cleared out. This is preventing blocks from being placed even when there is 
> plenty of space on the system.
> Here is an example of the block growth over 24 hours on one of our systems 
> running 2.6.0
>  !Screenshot from 2020-06-18 09-29-57.png! 
> Here is an example of the block growth over 24 hours on one of our systems 
> running 3.0.0
>  !Screenshot from 2020-06-18 09-31-15.png! 
> https://issues.apache.org/jira/browse/HDFS-1172 appears to be the main issue 
> we were having on 2.6.0, so the growth has decreased since upgrading to 3.0.0; 
> however, there still appears to be systemic growth in scheduled blocks over 
> time, and our systems still need to restart the namenode on occasion to reset 
> this count. I have not determined what is causing the leaked blocks in 3.0.0.
> Looking into the issue, I discovered that the intention is for scheduled 
> blocks to slowly go back down to 0 after errors cause blocks to be leaked.
> {code}
>   /** Increment the number of blocks scheduled. */
>   void incrementBlocksScheduled(StorageType t) {
> currApproxBlocksScheduled.add(t, 1);
>   }
>   
>   /** Decrement the number of blocks scheduled. */
>   void decrementBlocksScheduled(StorageType t) {
> if (prevApproxBlocksScheduled.get(t) > 0) {
>   prevApproxBlocksScheduled.subtract(t, 1);
> } else if (currApproxBlocksScheduled.get(t) > 0) {
>   currApproxBlocksScheduled.subtract(t, 1);
> } 
> // its ok if both counters are zero.
>   }
>   
>   /** Adjusts curr and prev number of blocks scheduled every few minutes. */
>   private void rollBlocksScheduled(long now) {
> if (now - lastBlocksScheduledRollTime > BLOCKS_SCHEDULED_ROLL_INTERVAL) {
>   prevApproxBlocksScheduled.set(currApproxBlocksScheduled);
>   currApproxBlocksScheduled.reset();
>   lastBlocksScheduledRollTime = now;
> }
>   }
> {code}
> However, this code does not do what is intended if the system has a constant 
> flow of written blocks. If leaked blocks make it into prevApproxBlocksScheduled, 
> the next scheduled block increments currApproxBlocksScheduled, and when it 
> completes it decrements prevApproxBlocksScheduled, preventing the leaked block 
> from being removed from the approx count. So, for errors to be corrected, we 
> would have to write no data for the 10-minute roll period. The number of blocks 
> we write per 10 minutes is quite high, which allows the error on the approx 
> counts to grow to very large numbers.
> The comments in the ticket for the original implementation suggest this issue 
> was known: https://issues.apache.org/jira/browse/HADOOP-3707. However, it's not 
> clear to me if the severity of it was known at the time.
> > So if there are some blocks that are not reported back by the datanode, 
> > they will eventually get adjusted (usually 10 min; bit longer if datanode 
> > is continuously receiving blocks).
> The comments suggest it will eventually get cleared out, but in our case, it 
> never gets cleared out.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15372) Files in snapshots no longer see attribute provider permissions

2020-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139441#comment-17139441
 ] 

Hudson commented on HDFS-15372:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18363 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18363/])
Revert "HDFS-15372. Files in snapshots no longer see attribute provider 
(weichiu: rev edf716a5c3ed7f51c994ec8bcc460445f9bb8ece)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodesInPath.java
HDFS-15372. Files in snapshots no longer see attribute provider (weichiu: rev 
d50e93ce7b6aba235ecc0143fe2c7a0150a3ceae)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodesInPath.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java


> Files in snapshots no longer see attribute provider permissions
> ---
>
> Key: HDFS-15372
> URL: https://issues.apache.org/jira/browse/HDFS-15372
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15372.001.patch, HDFS-15372.002.patch, 
> HDFS-15372.003.patch, HDFS-15372.004.patch, HDFS-15372.005.patch
>
>
> Given a cluster with an authorization provider configured (eg Sentry) and the 
> paths covered by the provider are snapshotable, there was a change in 
> behaviour in how the provider permissions and ACLs are applied to files in 
> snapshots between the 2.x branch and Hadoop 3.0.
> Eg, if we have the snapshotable path /data, which is Sentry managed. The ACLs 
> below are provided by Sentry:
> {code}
> hadoop fs -getfacl -R /data
> # file: /data
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::---
> group:flume:rwx
> user:hive:rwx
> group:hive:rwx
> group:testgroup:rwx
> mask::rwx
> other::--x
> /data/tab1
> {code}
> After taking a snapshot, the files in the snapshot do not see the provider 
> permissions:
> {code}
> hadoop fs -getfacl -R /data/.snapshot
> # file: /data/.snapshot
> # owner: 
> # group: 
> user::rwx
> group::rwx
> other::rwx
> # file: /data/.snapshot/snap1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/.snapshot/snap1/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> {code}
> However pre-Hadoop 3.0 (when the attribute provider etc was extensively 
> refactored) snapshots did get the provider permissions.
> The reason is this code in FSDirectory.java which ultimately calls the 
> attribute provider and passes the path we want permissions for:
> {code}
>   INodeAttributes getAttributes(INodesInPath iip)
>   throws IOException {
> INode node = FSDirectory.resolveLastINode(iip);
> int snapshot = iip.getPathSnapshotId();
> INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
> UserGroupInformation ugi = NameNode.getRemoteUser();
> INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi);
> if (ap != null) {
>   // permission checking sends the full components array including the
>   // first empty component for the root.  however file status
>   // related calls are expected to strip out the root component according
>   // to TestINodeAttributeProvider.
>   byte[][] components = iip.getPathComponents();
>   components = Arrays.copyOfRange(components, 1, components.length);
>   nodeAttrs = ap.getAttributes(components, nodeAttrs);
> }
> return nodeAttrs;
>   }
> {code}
> The line:
> {code}
> INode node = FSDirectory.resolveLastINode(iip);
> {code}
> Picks the last resolved Inode, and if you then call node.getPathComponents 
> for a path like '/data/.snapshot/snap1/tab1' it will return /data/tab1. It 
> resolves the snapshot path to its original location, but it is still the 
> snapshot inode.
> However the logic passes 'iip.getPathComponents' which returns 
> "/user/.snapshot/snap1/tab" to the provider.
> The pre Hadoop 3.0 code passes the inode directly to 

[jira] [Commented] (HDFS-15420) approx scheduled blocks not resetting over time

2020-06-18 Thread Max Mizikar (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139440#comment-17139440
 ] 

Max Mizikar commented on HDFS-15420:


We are forking and switching the order in which currApprox and prevApprox get 
decremented: we will decrement currApprox first. I don't think this is a good 
solution for everyone, as it is much more likely to undercount in a functioning 
system than the current implementation. We are running deployments where all 
nodes have the same disk size and we have alerts long before we fill up disk, so 
we do not need to worry much about the scheduled count.
We have also considered making currApprox and prevApprox a map from block to 
count. We have run this as a test for a bit and it seemed to work reasonably 
well. It certainly costs more CPU and memory and requires more synchronization, 
but that has not been an issue for us.
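
For reference, a minimal self-contained sketch of the reordering described above (drain currApprox before prevApprox); the enum and the EnumMap stand in for the real StorageType/EnumCounters types, and this is offered as an illustration rather than a proposed patch.

{code:java}
import java.util.EnumMap;
import java.util.Map;

public class BlocksScheduledSketch {
  enum StorageType { DISK, SSD }   // stand-in for the real StorageType

  private final Map<StorageType, Long> curr = new EnumMap<>(StorageType.class);
  private final Map<StorageType, Long> prev = new EnumMap<>(StorageType.class);

  void incrementBlocksScheduled(StorageType t) {
    curr.merge(t, 1L, Long::sum);
  }

  /** Reordered decrement: drain the current window first, then the previous. */
  void decrementBlocksScheduled(StorageType t) {
    if (curr.getOrDefault(t, 0L) > 0) {
      curr.merge(t, -1L, Long::sum);
    } else if (prev.getOrDefault(t, 0L) > 0) {
      prev.merge(t, -1L, Long::sum);
    }
    // it's ok if both counters are zero
  }

  /** Same roll as before: whatever leaked into prev is dropped at the next roll. */
  void rollBlocksScheduled() {
    prev.clear();
    prev.putAll(curr);
    curr.clear();
  }
}
{code}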

> approx scheduled blocks not resetting over time
> --
>
> Key: HDFS-15420
> URL: https://issues.apache.org/jira/browse/HDFS-15420
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: block placement
>Affects Versions: 2.6.0, 3.0.0
> Environment: Our 2.6.0 environment is a 3 node cluster running 
> cdh5.15.0.
> Our 3.0.0 environment is a 4 node cluster running cdh6.3.0.
>Reporter: Max Mizikar
>Priority: Minor
> Attachments: Screenshot from 2020-06-18 09-29-57.png, Screenshot from 
> 2020-06-18 09-31-15.png
>
>
> We have been experiencing large amounts of scheduled blocks that never get 
> cleared out. This is preventing blocks from being placed even when there is 
> plenty of space on the system.
> Here is an example of the block growth over 24 hours on one of our systems 
> running 2.6.0
>  !Screenshot from 2020-06-18 09-29-57.png! 
> Here is an example of the block growth over 24 hours on one of our systems 
> running 3.0.0
>  !Screenshot from 2020-06-18 09-31-15.png! 
> https://issues.apache.org/jira/browse/HDFS-1172 appears to be the main issue 
> we were having on 2.6.0, so the growth has decreased since upgrading to 3.0.0; 
> however, there still appears to be systemic growth in scheduled blocks over 
> time, and our systems still need to restart the namenode on occasion to reset 
> this count. I have not determined what is causing the leaked blocks in 3.0.0.
> Looking into the issue, I discovered that the intention is for scheduled 
> blocks to slowly go back down to 0 after errors cause blocks to be leaked.
> {code}
>   /** Increment the number of blocks scheduled. */
>   void incrementBlocksScheduled(StorageType t) {
> currApproxBlocksScheduled.add(t, 1);
>   }
>   
>   /** Decrement the number of blocks scheduled. */
>   void decrementBlocksScheduled(StorageType t) {
> if (prevApproxBlocksScheduled.get(t) > 0) {
>   prevApproxBlocksScheduled.subtract(t, 1);
> } else if (currApproxBlocksScheduled.get(t) > 0) {
>   currApproxBlocksScheduled.subtract(t, 1);
> } 
> // its ok if both counters are zero.
>   }
>   
>   /** Adjusts curr and prev number of blocks scheduled every few minutes. */
>   private void rollBlocksScheduled(long now) {
> if (now - lastBlocksScheduledRollTime > BLOCKS_SCHEDULED_ROLL_INTERVAL) {
>   prevApproxBlocksScheduled.set(currApproxBlocksScheduled);
>   currApproxBlocksScheduled.reset();
>   lastBlocksScheduledRollTime = now;
> }
>   }
> {code}
> However, this code does not do what is intended if the system has a constant 
> flow of written blocks. If leaked blocks make it into prevApproxBlocksScheduled, 
> the next scheduled block increments currApproxBlocksScheduled, and when it 
> completes it decrements prevApproxBlocksScheduled, preventing the leaked block 
> from being removed from the approx count. So, for errors to be corrected, we 
> would have to write no data for the 10-minute roll period. The number of blocks 
> we write per 10 minutes is quite high, which allows the error on the approx 
> counts to grow to very large numbers.
> The comments in the ticket for the original implementation suggest this issue 
> was known: https://issues.apache.org/jira/browse/HADOOP-3707. However, it's not 
> clear to me if the severity of it was known at the time.
> > So if there are some blocks that are not reported back by the datanode, 
> > they will eventually get adjusted (usually 10 min; bit longer if datanode 
> > is continuously receiving blocks).
> The comments suggest it will eventually get cleared out, but in our case, it 
> never gets cleared out.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15420) approx scheduled blocks not resetting over time

2020-06-18 Thread Max Mizikar (Jira)
Max Mizikar created HDFS-15420:
--

 Summary: approx scheduled blocks not resetting over time
 Key: HDFS-15420
 URL: https://issues.apache.org/jira/browse/HDFS-15420
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: block placement
Affects Versions: 3.0.0, 2.6.0
 Environment: Our 2.6.0 environment is a 3 node cluster running 
cdh5.15.0.
Our 3.0.0 environment is a 4 node cluster running cdh6.3.0.
Reporter: Max Mizikar
 Attachments: Screenshot from 2020-06-18 09-29-57.png, Screenshot from 
2020-06-18 09-31-15.png

We have been experiencing large amounts of scheduled blocks that never get 
cleared out. This is preventing blocks from being placed even when there is 
plenty of space on the system.
Here is an example of the block growth over 24 hours on one of our systems 
running 2.6.0
 !Screenshot from 2020-06-18 09-29-57.png! 
Here is an example of the block growth over 24 hours on one of our systems 
running 3.0.0
 !Screenshot from 2020-06-18 09-31-15.png! 
https://issues.apache.org/jira/browse/HDFS-1172 appears to be the main issue we 
were having on 2.6.0, so the growth has decreased since upgrading to 3.0.0; 
however, there still appears to be systemic growth in scheduled blocks over 
time, and our systems still need to restart the namenode on occasion to reset 
this count. I have not determined what is causing the leaked blocks in 3.0.0.

Looking into the issue, I discovered that the intention is for scheduled blocks 
to slowly go back down to 0 after errors cause blocks to be leaked.
{code}
  /** Increment the number of blocks scheduled. */
  void incrementBlocksScheduled(StorageType t) {
currApproxBlocksScheduled.add(t, 1);
  }
  
  /** Decrement the number of blocks scheduled. */
  void decrementBlocksScheduled(StorageType t) {
if (prevApproxBlocksScheduled.get(t) > 0) {
  prevApproxBlocksScheduled.subtract(t, 1);
} else if (currApproxBlocksScheduled.get(t) > 0) {
  currApproxBlocksScheduled.subtract(t, 1);
} 
// its ok if both counters are zero.
  }
  
  /** Adjusts curr and prev number of blocks scheduled every few minutes. */
  private void rollBlocksScheduled(long now) {
if (now - lastBlocksScheduledRollTime > BLOCKS_SCHEDULED_ROLL_INTERVAL) {
  prevApproxBlocksScheduled.set(currApproxBlocksScheduled);
  currApproxBlocksScheduled.reset();
  lastBlocksScheduledRollTime = now;
}
  }
{code}

However, this code does not do what is intended if the system has a constant 
flow of written blocks. If leaked blocks make it into prevApproxBlocksScheduled, 
the next scheduled block increments currApproxBlocksScheduled, and when it 
completes it decrements prevApproxBlocksScheduled, preventing the leaked block 
from being removed from the approx count. So, for errors to be corrected, we 
would have to write no data for the 10-minute roll period. The number of blocks 
we write per 10 minutes is quite high, which allows the error on the approx 
counts to grow to very large numbers.

The comments in the ticket for the original implementation suggest this issue 
was known: https://issues.apache.org/jira/browse/HADOOP-3707. However, it's not 
clear to me if the severity of it was known at the time.
> So if there are some blocks that are not reported back by the datanode, they 
> will eventually get adjusted (usually 10 min; bit longer if datanode is 
> continuously receiving blocks).
The comments suggest it will eventually get cleared out, but in our case, it 
never gets cleared out.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15372) Files in snapshots no longer see attribute provider permissions

2020-06-18 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139432#comment-17139432
 ] 

Wei-Chiu Chuang commented on HDFS-15372:


Accidentally committed version 004 instead of the last, 005 version. This is 
now corrected.

> Files in snapshots no longer see attribute provider permissions
> ---
>
> Key: HDFS-15372
> URL: https://issues.apache.org/jira/browse/HDFS-15372
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15372.001.patch, HDFS-15372.002.patch, 
> HDFS-15372.003.patch, HDFS-15372.004.patch, HDFS-15372.005.patch
>
>
> Given a cluster with an authorization provider configured (eg Sentry) and the 
> paths covered by the provider are snapshotable, there was a change in 
> behaviour in how the provider permissions and ACLs are applied to files in 
> snapshots between the 2.x branch and Hadoop 3.0.
> Eg, if we have the snapshotable path /data, which is Sentry managed. The ACLs 
> below are provided by Sentry:
> {code}
> hadoop fs -getfacl -R /data
> # file: /data
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::---
> group:flume:rwx
> user:hive:rwx
> group:hive:rwx
> group:testgroup:rwx
> mask::rwx
> other::--x
> /data/tab1
> {code}
> After taking a snapshot, the files in the snapshot do not see the provider 
> permissions:
> {code}
> hadoop fs -getfacl -R /data/.snapshot
> # file: /data/.snapshot
> # owner: 
> # group: 
> user::rwx
> group::rwx
> other::rwx
> # file: /data/.snapshot/snap1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/.snapshot/snap1/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> {code}
> However pre-Hadoop 3.0 (when the attribute provider etc was extensively 
> refactored) snapshots did get the provider permissions.
> The reason is this code in FSDirectory.java which ultimately calls the 
> attribute provider and passes the path we want permissions for:
> {code}
>   INodeAttributes getAttributes(INodesInPath iip)
>   throws IOException {
> INode node = FSDirectory.resolveLastINode(iip);
> int snapshot = iip.getPathSnapshotId();
> INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
> UserGroupInformation ugi = NameNode.getRemoteUser();
> INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi);
> if (ap != null) {
>   // permission checking sends the full components array including the
>   // first empty component for the root.  however file status
>   // related calls are expected to strip out the root component according
>   // to TestINodeAttributeProvider.
>   byte[][] components = iip.getPathComponents();
>   components = Arrays.copyOfRange(components, 1, components.length);
>   nodeAttrs = ap.getAttributes(components, nodeAttrs);
> }
> return nodeAttrs;
>   }
> {code}
> The line:
> {code}
> INode node = FSDirectory.resolveLastINode(iip);
> {code}
> Picks the last resolved Inode, and if you then call node.getPathComponents 
> for a path like '/data/.snapshot/snap1/tab1' it will return /data/tab1. It 
> resolves the snapshot path to its original location, but it is still the 
> snapshot inode.
> However the logic passes 'iip.getPathComponents' which returns 
> "/user/.snapshot/snap1/tab" to the provider.
> The pre Hadoop 3.0 code passes the inode directly to the provider, and hence 
> it only ever sees the path as "/user/data/tab1".
> It is debatable which path should be passed to the provider - 
> /user/.snapshot/snap1/tab or /data/tab1 in the case of snapshots. However as 
> the behaviour has changed I feel we should ensure the old behaviour is 
> retained.
> It would also be fairly easy to provide a config switch so the provider gets 
> the full snapshot path or the resolved path.
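> As a rough illustration only (the boolean flag useResolvedPathForProvider 
> below is hypothetical and would be read from such a config switch; this is 
> not the committed patch), the guard inside getAttributes() might look like:
> {code:java}
> // Choose which path components are handed to the attribute provider.
> byte[][] components = useResolvedPathForProvider
>     ? node.getPathComponents()   // resolved path, e.g. /data/tab1
>     : iip.getPathComponents();   // snapshot path, e.g. /data/.snapshot/snap1/tab1
> // Strip the leading empty root component, as the existing code does.
> components = Arrays.copyOfRange(components, 1, components.length);
> nodeAttrs = ap.getAttributes(components, nodeAttrs);
> {code}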



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-18 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139379#comment-17139379
 ] 

Stephen O'Donnell commented on HDFS-15406:
--

Committed this to all active 3.x branches with no conflicts. Thanks for the 
contribution [~hemanthboyina]!

> Improve the speed of Datanode Block Scan
> 
>
> Key: HDFS-15406
> URL: https://issues.apache.org/jira/browse/HDFS-15406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-15406.001.patch, HDFS-15406.002.patch
>
>
> In our customer cluster we have approximately 10M blocks in one datanode. 
> For the Datanode to scan all the blocks, it has taken nearly 5 minutes.
> {code:java}
> 2020-06-10 12:17:06,869 | INFO  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
> 11149530, missing metadata files:472, missing block files:472, missing blocks 
> in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
> 2020-06-10 12:17:06,869 | WARN  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | Lock held time above threshold: lock identifier: 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>  | InstrumentedLock.java:143 {code}
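> One common way to attack this kind of long lock hold (shown here only as a 
> generic, self-contained sketch; all class and field names are illustrative 
> and this is not the committed patch) is to take a cheap copy of the 
> in-memory block list under the dataset lock and run the expensive comparison 
> outside it:
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.locks.ReentrantLock;
>
> public class ScanSketch {
>   private final ReentrantLock datasetLock = new ReentrantLock();
>   private final Map<Long, String> blockIdToMeta = new ConcurrentHashMap<>();
>
>   /** Returns block ids known in memory but missing from the on-disk scan. */
>   List<Long> reconcile(List<Long> onDiskBlockIds) {
>     final List<Long> inMemoryIds;
>     datasetLock.lock();
>     try {
>       // Cheap copy under the lock; no per-block disk work happens here.
>       inMemoryIds = new ArrayList<>(blockIdToMeta.keySet());
>     } finally {
>       datasetLock.unlock();
>     }
>     // The expensive O(#blocks) diff runs without holding the dataset lock,
>     // so other dataset users are not blocked for minutes on a 10M-block node.
>     List<Long> missingOnDisk = new ArrayList<>(inMemoryIds);
>     missingOnDisk.removeAll(onDiskBlockIds);
>     return missingOnDisk;
>   }
> }
> {code}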



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-18 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-15406:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Improve the speed of Datanode Block Scan
> 
>
> Key: HDFS-15406
> URL: https://issues.apache.org/jira/browse/HDFS-15406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-15406.001.patch, HDFS-15406.002.patch
>
>
> In our customer cluster we have approximately 10M blocks in one datanode. 
> For the Datanode to scan all the blocks, it has taken nearly 5 minutes.
> {code:java}
> 2020-06-10 12:17:06,869 | INFO  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
> 11149530, missing metadata files:472, missing block files:472, missing blocks 
> in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
> 2020-06-10 12:17:06,869 | WARN  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | Lock held time above threshold: lock identifier: 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>  | InstrumentedLock.java:143 {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-18 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-15406:
-
Fix Version/s: 3.1.5
   3.4.0
   3.3.1
   3.2.2

> Improve the speed of Datanode Block Scan
> 
>
> Key: HDFS-15406
> URL: https://issues.apache.org/jira/browse/HDFS-15406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-15406.001.patch, HDFS-15406.002.patch
>
>
> In our customer cluster we have approximately 10M blocks in one datanode. 
> For the Datanode to scan all the blocks, it has taken nearly 5 minutes.
> {code:java}
> 2020-06-10 12:17:06,869 | INFO  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
> 11149530, missing metadata files:472, missing block files:472, missing blocks 
> in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
> 2020-06-10 12:17:06,869 | WARN  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | Lock held time above threshold: lock identifier: 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>  | InstrumentedLock.java:143 {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15419) Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-18 Thread Yuxuan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139351#comment-17139351
 ] 

Yuxuan Wang commented on HDFS-15419:


[~ayushtkn] But the router will currently retry or fail over once in the code. 
I don't know why that was introduced by some patch. Should we file a jira to 
remove that logic?

> Router should retry communicate with NN when cluster is unavailable using 
> configurable time interval
> 
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: configuration, hdfs-client, rbf
>Reporter: bhji123
>Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only 
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several hdfs clusters with more than 
> 1000 nodes, we have encountered this problem. In some cases, the cluster 
> becomes unavailable briefly for about 10 or 30 seconds; at the same time, 
> almost all rpc requests to the router fail because the router only retries 
> once without a time interval.
> It would be better to enhance the router retry strategy so that it retries 
> communication with the NN using a configurable time interval and max retry 
> times.
>  
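> For reference, if such a policy were added, Hadoop's existing retry utilities 
> already cover the mechanics; a minimal sketch (both property names below are 
> hypothetical, not existing RBF keys):
> {code:java}
> import java.util.concurrent.TimeUnit;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.io.retry.RetryPolicies;
> import org.apache.hadoop.io.retry.RetryPolicy;
>
> public class RouterRetrySketch {
>   // Both property names are hypothetical, not existing RBF keys.
>   static RetryPolicy buildPolicy(Configuration conf) {
>     int maxAttempts = conf.getInt(
>         "dfs.federation.router.nn.retry.max-attempts", 3);
>     long intervalMs = conf.getTimeDuration(
>         "dfs.federation.router.nn.retry.interval", 1000, TimeUnit.MILLISECONDS);
>     // Sleep a configurable interval between attempts instead of the single
>     // immediate retry described above.
>     return RetryPolicies.retryUpToMaximumCountWithFixedSleep(
>         maxAttempts, intervalMs, TimeUnit.MILLISECONDS);
>   }
> }
> {code}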



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139346#comment-17139346
 ] 

Hudson commented on HDFS-15406:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18362 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18362/])
HDFS-15406. Improve the speed of Datanode Block Scan. Contributed by 
(sodonnell: rev 123777823edc98553fcef61f1913ab6e4cd5aa9a)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java


> Improve the speed of Datanode Block Scan
> 
>
> Key: HDFS-15406
> URL: https://issues.apache.org/jira/browse/HDFS-15406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15406.001.patch, HDFS-15406.002.patch
>
>
> In our customer cluster we have approximately 10M blocks in one datanode. 
> For the Datanode to scan all the blocks, it has taken nearly 5 minutes.
> {code:java}
> 2020-06-10 12:17:06,869 | INFO  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
> 11149530, missing metadata files:472, missing block files:472, missing blocks 
> in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
> 2020-06-10 12:17:06,869 | WARN  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | Lock held time above threshold: lock identifier: 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>  | InstrumentedLock.java:143 {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15417) Lazy get the datanode report for federation WebHDFS operations

2020-06-18 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139299#comment-17139299
 ] 

Ayush Saxena commented on HDFS-15417:
-

It indeed makes sense to minimize the getDatanodeReport() calls in whichever 
ways possible. I couldn't check the code, but it would be good if we can 
restrict it. In a big cluster I too have observed that getDatanodeReport() is 
quite heavy. If it still bothers us much, we can think about caching and 
similar approaches as well.

> Lazy get the datanode report for federation WebHDFS operations
> --
>
> Key: HDFS-15417
> URL: https://issues.apache.org/jira/browse/HDFS-15417
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation, rbf, webhdfs
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Minor
>
> *Why*
>  For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or 
> namenode needs to get the datanodes where the block is located, then redirect 
> the request to one of the datanodes.
> However, this chooseDatanode action in router is much slower than namenode, 
> which directly affects the WebHDFS operations above.
> For namenode WebHDFS, it normally takes tens of milliseconds, while router 
> always takes more than 2 seconds.
> *How*
>  Only get the datanode report when necessary in the router. It is a very 
> expensive operation and is where all the time is spent.
> This is only needed when we want to exclude some datanodes or find a random 
> datanode for CREATE.
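> A minimal, generic sketch of the lazy/memoized idea (the class below is 
> illustrative, not the actual router code):
> {code:java}
> import java.util.function.Supplier;
>
> /** Illustrative only: compute the report the first time it is needed. */
> public class LazyReport<T> implements Supplier<T> {
>   private final Supplier<T> loader;
>   private T cached;
>
>   public LazyReport(Supplier<T> loader) { this.loader = loader; }
>
>   @Override
>   public synchronized T get() {
>     if (cached == null) {
>       cached = loader.get(); // e.g. the multi-second getDatanodeReport() call
>     }
>     return cached;
>   }
> }
> {code}
> A CREATE or OPEN redirect that never consults the report would then never pay 
> for it.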



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15419) Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-18 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139292#comment-17139292
 ] 

Ayush Saxena commented on HDFS-15419:
-

I think this has been discussed somewhere before as well.

The Router is just a proxy: it takes the call from the client, proxies it to 
the nameservice, and whatever response it gets from the nameservice it gives 
back to the actual client.

It is up to the actual client's discretion whether it wants to wait/retry or 
not. Holding and retrying a call at the Router doesn't seem very apt to me.

The retry logic is already there in the client-side code. This may lead to 
double retries too, and it would be better if the client alone decides whether 
it needs to try again or not.

> Router should retry communicate with NN when cluster is unavailable using 
> configurable time interval
> 
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: configuration, hdfs-client, rbf
>Reporter: bhji123
>Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only 
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several hdfs clusters with more than 
> 1000 nodes, we have encountered this problem. In some cases, the cluster 
> becomes unavailable briefly for about 10 or 30 seconds; at the same time, 
> almost all rpc requests to the router fail because the router only retries 
> once without a time interval.
> It would be better to enhance the router retry strategy so that it retries 
> communication with the NN using a configurable time interval and max retry 
> times.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15419) Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-18 Thread Yuxuan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139275#comment-17139275
 ] 

Yuxuan Wang commented on HDFS-15419:


Hi [~bhji123],
If the router retries more times and for longer, but the clients' timeout and 
retry settings are still in place, how does that work?

If the router retries but the NN is still unavailable, the clients eventually 
time out and then retry anyway. In this case, what is the difference between 
letting the router retry and letting the clients retry?

> Router should retry communicate with NN when cluster is unavailable using 
> configurable time interval
> 
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: configuration, hdfs-client, rbf
>Reporter: bhji123
>Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only 
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several hdfs clusters with more than 
> 1000 nodes, we have encountered this problem. In some cases, the cluster 
> becomes unavailable briefly for about 10 or 30 seconds; at the same time, 
> almost all rpc requests to the router fail because the router only retries 
> once without a time interval.
> It would be better to enhance the router retry strategy so that it retries 
> communication with the NN using a configurable time interval and max retry 
> times.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15419) Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-18 Thread Yuxuan Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxuan Wang reassigned HDFS-15419:
--

Assignee: (was: Yuxuan Wang)

> Router should retry communicate with NN when cluster is unavailable using 
> configurable time interval
> 
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: configuration, hdfs-client, rbf
>Reporter: bhji123
>Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only 
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several hdfs clusters with more than 
> 1000 nodes, we have encountered this problem. In some cases, the cluster 
> becomes unavailable briefly for about 10 or 30 seconds; at the same time, 
> almost all rpc requests to the router fail because the router only retries 
> once without a time interval.
> It would be better to enhance the router retry strategy so that it retries 
> communication with the NN using a configurable time interval and max retry 
> times.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15419) Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-18 Thread Yuxuan Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxuan Wang reassigned HDFS-15419:
--

Assignee: Yuxuan Wang

> Router should retry communicate with NN when cluster is unavailable using 
> configurable time interval
> 
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: configuration, hdfs-client, rbf
>Reporter: bhji123
>Assignee: Yuxuan Wang
>Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only 
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several hdfs clusters with more than 
> 1000 nodes, we have encountered this problem. In some cases, the cluster 
> becomes unavailable briefly for about 10 or 30 seconds; at the same time, 
> almost all rpc requests to the router fail because the router only retries 
> once without a time interval.
> It would be better to enhance the router retry strategy so that it retries 
> communication with the NN using a configurable time interval and max retry 
> times.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool

2020-06-18 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139216#comment-17139216
 ] 

Hadoop QA commented on HDFS-15410:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
16s{color} | {color:green} hadoop-federation-balance in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-HDFS-Build/29438/artifact/out/Dockerfile
 |
| JIRA Issue | HDFS-15410 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005930/HDFS-15410.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle xml |
| uname | Linux 22c1f191b89c 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9cbd76cc775 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
|  

[jira] [Updated] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool

2020-06-18 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-15410:
---
Attachment: HDFS-15410.001.patch
Status: Patch Available  (was: Open)

> Add separated config file fedbalance-default.xml for fedbalance tool
> 
>
> Key: HDFS-15410
> URL: https://issues.apache.org/jira/browse/HDFS-15410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15410.001.patch
>
>
> Add a separate config file named fedbalance-default.xml for fedbalance tool 
> configs. It's like the distcp-default.xml for the distcp tool.
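> A minimal sketch of the usual wiring for such a defaults file (class name is 
> illustrative; whether the patch wires it exactly this way is not shown here):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class FedBalanceConfSketch {
>   public static Configuration create() {
>     Configuration conf = new Configuration();
>     // Values from fedbalance-default.xml act as defaults that a site file or
>     // explicit setters can still override.
>     conf.addResource("fedbalance-default.xml");
>     return conf;
>   }
> }
> {code}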



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15374) Add documentation for fedbalance tool

2020-06-18 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139148#comment-17139148
 ] 

Jinglun commented on HDFS-15374:


Hi [~linyiqun], thanks for your comments! Uploaded v02 and the image 
BalanceProcedureScheduler.png.

> Add documentation for fedbalance tool
> -
>
> Key: HDFS-15374
> URL: https://issues.apache.org/jira/browse/HDFS-15374
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: BalanceProcedureScheduler.png, HDFS-15374.001.patch, 
> HDFS-15374.002.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15374) Add documentation for fedbalance tool

2020-06-18 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-15374:
---
Attachment: BalanceProcedureScheduler.png

> Add documentation for fedbalance tool
> -
>
> Key: HDFS-15374
> URL: https://issues.apache.org/jira/browse/HDFS-15374
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: BalanceProcedureScheduler.png, HDFS-15374.001.patch, 
> HDFS-15374.002.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15374) Add documentation for fedbalance tool

2020-06-18 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-15374:
---
Attachment: HDFS-15374.002.patch

> Add documentation for fedbalance tool
> -
>
> Key: HDFS-15374
> URL: https://issues.apache.org/jira/browse/HDFS-15374
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15374.001.patch, HDFS-15374.002.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15419) Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-18 Thread bhji123 (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bhji123 updated HDFS-15419:
---
Description: 
When cluster is unavailable, router -> namenode communication will only retry 
once without any time interval, that is not reasonable.

For example, in my company, which has several hdfs clusters with more than 1000 
nodes, we have encountered this problem. In some cases, the cluster becomes 
unavailable briefly for about 10 or 30 seconds, at the same time, almost all 
rpc requests to router failed because router only retry once without time 
interval.

It's better for us to enhance the router retry strategy, to retry **communicate 
with NN using configurable time interval and max retry times.

 

  was:
When cluster is unavailable, router -> namenode communication will only retry 
once without any time interval, that is not reasonable.

For example, in my company, which has several hdfs clusters with more than 1000 
nodes, we have encountered this problem. In some cases, the cluster becomes 
unavailable briefly for about 10 or 30 seconds, at the same time, almost all 
rpc requests to router failed because router only retry once without time 
interval.

It's better for us to enhance the router retry strategy, to retry with 
configurable time interval and max retry times.




 


> Router should retry communicate with NN when cluster is unavailable using 
> configurable time interval
> 
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: configuration, hdfs-client, rbf
>Reporter: bhji123
>Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only 
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several hdfs clusters with more than 
> 1000 nodes, we have encountered this problem. In some cases, the cluster 
> becomes unavailable briefly for about 10 or 30 seconds; at the same time, 
> almost all rpc requests to the router fail because the router only retries 
> once without a time interval.
> It would be better to enhance the router retry strategy so that it retries 
> communication with the NN using a configurable time interval and max retry 
> times.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15419) Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-18 Thread bhji123 (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bhji123 updated HDFS-15419:
---
Summary: Router should retry communicate with NN when cluster is 
unavailable using configurable time interval  (was: router retry with 
configurable time interval when cluster is unavailable)

> Router should retry communicate with NN when cluster is unavailable using 
> configurable time interval
> 
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: configuration, hdfs-client, rbf
>Reporter: bhji123
>Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only 
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several hdfs clusters with more than 
> 1000 nodes, we have encountered this problem. In some cases, the cluster 
> becomes unavailable briefly for about 10 or 30 seconds; at the same time, 
> almost all rpc requests to the router fail because the router only retries 
> once without a time interval.
> It would be better to enhance the router retry strategy, to retry with a 
> configurable time interval and max retry times.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-15416) The addStorageLocations() method in the DataStorage class is not perfect.

2020-06-18 Thread jianghua zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-15416 started by jianghua zhu.
---
> The addStorageLocations() method in the DataStorage class is not perfect.
> -
>
> Key: HDFS-15416
> URL: https://issues.apache.org/jira/browse/HDFS-15416
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.1.1
>Reporter: jianghua zhu
>Assignee: jianghua zhu
>Priority: Major
> Attachments: HDFS-15416.patch
>
>
> successLocations is a list; when it is empty, there is no need to call 
> loadBlockPoolSliceStorage() again.
> Code:
> try {
>   final List successLocations = loadDataStorage(
>       datanode, nsInfo, dataDirs, startOpt, executor);
>   return loadBlockPoolSliceStorage(
>       datanode, nsInfo, successLocations, startOpt, executor);
> } finally {
>   executor.shutdown();
> }
>  
>  
>  
>  
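> A sketch of the suggested guard, reusing the simplified snippet above (the 
> element type of successLocations is omitted here as in the original):
> {code:java}
> try {
>   final List successLocations = loadDataStorage(
>       datanode, nsInfo, dataDirs, startOpt, executor);
>   if (successLocations.isEmpty()) {
>     // Nothing was loaded, so skip loadBlockPoolSliceStorage() entirely.
>     return Collections.emptyList();
>   }
>   return loadBlockPoolSliceStorage(
>       datanode, nsInfo, successLocations, startOpt, executor);
> } finally {
>   executor.shutdown();
> }
> {code}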



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work stopped] (HDFS-15416) The addStorageLocations() method in the DataStorage class is not perfect.

2020-06-18 Thread jianghua zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-15416 stopped by jianghua zhu.
---
> The addStorageLocations() method in the DataStorage class is not perfect.
> -
>
> Key: HDFS-15416
> URL: https://issues.apache.org/jira/browse/HDFS-15416
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.1.1
>Reporter: jianghua zhu
>Assignee: jianghua zhu
>Priority: Major
> Attachments: HDFS-15416.patch
>
>
> successLocations is a list; when it is empty, there is no need to call 
> loadBlockPoolSliceStorage() again.
> Code:
> try {
>   final List successLocations = loadDataStorage(
>       datanode, nsInfo, dataDirs, startOpt, executor);
>   return loadBlockPoolSliceStorage(
>       datanode, nsInfo, successLocations, startOpt, executor);
> } finally {
>   executor.shutdown();
> }
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-15416) The addStorageLocations() method in the DataStorage class is not perfect.

2020-06-18 Thread jianghua zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-15416 started by jianghua zhu.
---
> The addStorageLocations() method in the DataStorage class is not perfect.
> -
>
> Key: HDFS-15416
> URL: https://issues.apache.org/jira/browse/HDFS-15416
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.1.1
>Reporter: jianghua zhu
>Assignee: jianghua zhu
>Priority: Major
> Attachments: HDFS-15416.patch
>
>
> successLocations is a list; when it is empty, there is no need to call 
> loadBlockPoolSliceStorage() again.
> Code:
> try {
>   final List successLocations = loadDataStorage(
>       datanode, nsInfo, dataDirs, startOpt, executor);
>   return loadBlockPoolSliceStorage(
>       datanode, nsInfo, successLocations, startOpt, executor);
> } finally {
>   executor.shutdown();
> }
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org