[jira] [Commented] (HDFS-13274) RBF: Extend RouterRpcClient to use multiple sockets

2020-05-28 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119228#comment-17119228
 ] 

Janus Chow commented on HDFS-13274:
---

It is a big federation cluster with many users querying different 
namespaces, so I think the RpcServerNumOpenConnections value is normal.

> RBF: Extend RouterRpcClient to use multiple sockets
> ---
>
> Key: HDFS-13274
> URL: https://issues.apache.org/jira/browse/HDFS-13274
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>
> HADOOP-13144 introduces the ability to create multiple connections for the 
> same user and use different sockets. The RouterRpcClient should use this 
> approach to get a better throughput.
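
The idea of spreading one user's calls over several sockets can be sketched as a round-robin pool. This is only an illustration: RoundRobinPool and its pick() method are hypothetical names, not the RouterRpcClient or HADOOP-13144 API, and strings stand in for real connections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical pool: hands out one of several connections in round-robin order. */
class RoundRobinPool<T> {
  private final List<T> connections;
  private final AtomicInteger next = new AtomicInteger();

  RoundRobinPool(List<T> connections) {
    if (connections.isEmpty()) {
      throw new IllegalArgumentException("pool needs at least one connection");
    }
    this.connections = new ArrayList<>(connections);
  }

  /** Returns the next connection; safe to call from multiple threads. */
  T pick() {
    // floorMod keeps the index valid even after the counter overflows.
    int i = Math.floorMod(next.getAndIncrement(), connections.size());
    return connections.get(i);
  }
}

class RoundRobinPoolDemo {
  public static void main(String[] args) {
    // Strings stand in for sockets opened for the same user.
    RoundRobinPool<String> pool =
        new RoundRobinPool<>(Arrays.asList("socket-0", "socket-1", "socket-2"));
    for (int call = 0; call < 5; call++) {
      System.out.println("call " + call + " -> " + pool.pick());
    }
  }
}
```

With more than one socket per user, concurrent calls no longer queue behind a single connection, which is where the throughput gain would be expected to come from.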



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15377) BlockScanner scans one part per round, expect full scans after several rounds

2020-05-28 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119229#comment-17119229
 ] 

Yang Yun commented on HDFS-15377:
-

Updated to HDFS-15377.004.patch to fix the compile error.

> BlockScanner scans one part per round, expect full scans after several rounds
> -
>
> Key: HDFS-15377
> URL: https://issues.apache.org/jira/browse/HDFS-15377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15377.002.patch, HDFS-15377.003.patch, 
> HDFS-15377.004.patch
>
>
> To reduce disk IO, one block is split into multiple parts, and BlockScanner 
> scans only one part per round. After several rounds, the full block should 
> have been scanned.
> Add a new option, "dfs.block.scanner.part.size": the maximum data size per 
> scan by the block scanner. This value should be a multiple of the chunk size, 
> for example, 512, 1024, 4096 ...
> The default value is -1, which disables partial scan.
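
A minimal sketch of the per-round behaviour described above, assuming the scanner simply remembers where the previous round stopped. PartialScanPlanner and nextRange() are hypothetical names for illustration; they are not taken from the patch.

```java
/** Hypothetical helper: computes which byte range of a block to scan in each round. */
class PartialScanPlanner {
  private final long blockLength;
  private final long partSize;   // stands in for dfs.block.scanner.part.size
  private long curPosition = 0;  // where the next round starts

  PartialScanPlanner(long blockLength, long partSize) {
    this.blockLength = blockLength;
    this.partSize = partSize;
  }

  /** Returns {start, endExclusive} for the next round and advances the position. */
  long[] nextRange() {
    if (partSize <= 0 || partSize >= blockLength) {
      return new long[] {0, blockLength};  // partial scan disabled: whole block
    }
    long start = curPosition;
    long end = Math.min(start + partSize, blockLength);
    curPosition = (end == blockLength) ? 0 : end;  // wrap after the last part
    return new long[] {start, end};
  }
}

class PartialScanPlannerDemo {
  public static void main(String[] args) {
    // A 10240-byte block scanned in parts of 4096 bytes (a multiple of 512).
    PartialScanPlanner planner = new PartialScanPlanner(10240, 4096);
    for (int round = 1; round <= 4; round++) {
      long[] r = planner.nextRange();
      System.out.println("round " + round + ": [" + r[0] + ", " + r[1] + ")");
    }
  }
}
```

In this sketch a 10240-byte block is fully covered after three rounds, and the fourth round starts again from offset 0. A part size of -1 falls into the first branch and scans the whole block every round.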






[jira] [Updated] (HDFS-15377) BlockScanner scans one part per round, expect full scans after several rounds

2020-05-28 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15377:

Attachment: HDFS-15377.004.patch
Status: Patch Available  (was: Open)

> BlockScanner scans one part per round, expect full scans after several rounds
> -
>
> Key: HDFS-15377
> URL: https://issues.apache.org/jira/browse/HDFS-15377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15377.002.patch, HDFS-15377.003.patch, 
> HDFS-15377.004.patch
>
>
> To reduce disk IO, one block is split into multiple parts, and BlockScanner 
> scans only one part per round. After several rounds, the full block should 
> have been scanned.
> Add a new option, "dfs.block.scanner.part.size": the maximum data size per 
> scan by the block scanner. This value should be a multiple of the chunk size, 
> for example, 512, 1024, 4096 ...
> The default value is -1, which disables partial scan.






[jira] [Updated] (HDFS-15377) BlockScanner scans one part per round, expect full scans after several rounds

2020-05-28 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15377:

Status: Open  (was: Patch Available)

> BlockScanner scans one part per round, expect full scans after several rounds
> -
>
> Key: HDFS-15377
> URL: https://issues.apache.org/jira/browse/HDFS-15377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15377.002.patch, HDFS-15377.003.patch
>
>
> To reduce disk IO, one block is split into multiple parts, and BlockScanner 
> scans only one part per round. After several rounds, the full block should 
> have been scanned.
> Add a new option, "dfs.block.scanner.part.size": the maximum data size per 
> scan by the block scanner. This value should be a multiple of the chunk size, 
> for example, 512, 1024, 4096 ...
> The default value is -1, which disables partial scan.






[jira] [Commented] (HDFS-15377) BlockScanner scans one part per round, expect full scans after several rounds

2020-05-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119225#comment-17119225
 ] 

Hadoop QA commented on HDFS-15377:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} | {color:red} HDFS-15377 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15377 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13004298/HDFS-15377.003.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29382/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> BlockScanner scans one part per round, expect full scans after several rounds
> -
>
> Key: HDFS-15377
> URL: https://issues.apache.org/jira/browse/HDFS-15377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15377.002.patch, HDFS-15377.003.patch
>
>
> To reduce disk IO, one block is split into multiple parts, and BlockScanner 
> scans only one part per round. After several rounds, the full block should 
> have been scanned.
> Add a new option, "dfs.block.scanner.part.size": the maximum data size per 
> scan by the block scanner. This value should be a multiple of the chunk size, 
> for example, 512, 1024, 4096 ...
> The default value is -1, which disables partial scan.






[jira] [Commented] (HDFS-15246) ArrayIndexOfboundsException in BlockManager CreateLocatedBlock

2020-05-28 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119220#comment-17119220
 ] 

hemanthboyina commented on HDFS-15246:
--

Thanks [~elgoiri] for the review.

I have updated the patch; please review.

> ArrayIndexOfboundsException in BlockManager CreateLocatedBlock
> --
>
> Key: HDFS-15246
> URL: https://issues.apache.org/jira/browse/HDFS-15246
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15246-testrepro.patch, HDFS-15246.001.patch, 
> HDFS-15246.002.patch
>
>
> java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:1362)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:1501)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:179)
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2047)
>  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:770)






[jira] [Updated] (HDFS-15246) ArrayIndexOfboundsException in BlockManager CreateLocatedBlock

2020-05-28 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15246:
-
Attachment: HDFS-15246.002.patch

> ArrayIndexOfboundsException in BlockManager CreateLocatedBlock
> --
>
> Key: HDFS-15246
> URL: https://issues.apache.org/jira/browse/HDFS-15246
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15246-testrepro.patch, HDFS-15246.001.patch, 
> HDFS-15246.002.patch
>
>
> java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:1362)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:1501)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:179)
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2047)
>  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:770)






[jira] [Commented] (HDFS-15377) BlockScanner scans one part per round, expect full scans after several rounds

2020-05-28 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119218#comment-17119218
 ] 

Yang Yun commented on HDFS-15377:
-

Thanks [~elgoiri] for the review.

Updated to HDFS-15377.003.patch with the following changes:
 * Added a private method, createBlockSender, for the new code.
 * Declared the variable 'curPosition' as an AtomicLong to address the FindBugs issue.
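
As a rough illustration of the second change, a position held in an AtomicLong can be read and advanced in one atomic step. Only the name curPosition comes from the comment above; ScanCursor, claimNext() and the wrap-to-zero rule are assumptions made for this sketch, not the patch's code.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical cursor that tracks how far a block has been scanned. */
class ScanCursor {
  private final AtomicLong curPosition = new AtomicLong(0);

  /** Atomically returns the current start offset and moves to the next part. */
  long claimNext(long partSize, long blockLength) {
    return curPosition.getAndUpdate(pos -> {
      long end = pos + partSize;
      return end >= blockLength ? 0 : end;  // wrap once the block is covered
    });
  }

  long position() {
    return curPosition.get();
  }
}

class ScanCursorDemo {
  public static void main(String[] args) {
    ScanCursor cursor = new ScanCursor();
    System.out.println("start of round 1: " + cursor.claimNext(512, 1024));
    System.out.println("start of round 2: " + cursor.claimNext(512, 1024));
    System.out.println("position after wrap: " + cursor.position());
  }
}
```

Using getAndUpdate avoids a separate read followed by a write, which is the kind of unsynchronized access on a shared field that static analysis tends to flag.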

> BlockScanner scans one part per round, expect full scans after several rounds
> -
>
> Key: HDFS-15377
> URL: https://issues.apache.org/jira/browse/HDFS-15377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15377.002.patch, HDFS-15377.003.patch
>
>
> To reduce disk IO, one block is split into multiple parts, and BlockScanner 
> scans only one part per round. After several rounds, the full block should 
> have been scanned.
> Add a new option, "dfs.block.scanner.part.size": the maximum data size per 
> scan by the block scanner. This value should be a multiple of the chunk size, 
> for example, 512, 1024, 4096 ...
> The default value is -1, which disables partial scan.






[jira] [Updated] (HDFS-15377) BlockScanner scans one part per round, expect full scans after several rounds

2020-05-28 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15377:

Attachment: HDFS-15377.003.patch
Status: Patch Available  (was: Open)

> BlockScanner scans one part per round, expect full scans after several rounds
> -
>
> Key: HDFS-15377
> URL: https://issues.apache.org/jira/browse/HDFS-15377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15377.002.patch, HDFS-15377.003.patch
>
>
> To reduce disk IO, one block is split into multiple parts, and BlockScanner 
> scans only one part per round. After several rounds, the full block should 
> have been scanned.
> Add a new option, "dfs.block.scanner.part.size": the maximum data size per 
> scan by the block scanner. This value should be a multiple of the chunk size, 
> for example, 512, 1024, 4096 ...
> The default value is -1, which disables partial scan.






[jira] [Updated] (HDFS-15377) BlockScanner scans one part per round, expect full scans after several rounds

2020-05-28 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15377:

Status: Open  (was: Patch Available)

> BlockScanner scans one part per round, expect full scans after several rounds
> -
>
> Key: HDFS-15377
> URL: https://issues.apache.org/jira/browse/HDFS-15377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15377.002.patch
>
>
> To reduce disk IO, one block is split into multiple parts, and BlockScanner 
> scans only one part per round. After several rounds, the full block should 
> have been scanned.
> Add a new option, "dfs.block.scanner.part.size": the maximum data size per 
> scan by the block scanner. This value should be a multiple of the chunk size, 
> for example, 512, 1024, 4096 ...
> The default value is -1, which disables partial scan.






[jira] [Commented] (HDFS-15168) ABFS driver enhancement - Allow customizable translation from AAD SPNs and security groups to Linux user and group

2020-05-28 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119215#comment-17119215
 ] 

Hudson commented on HDFS-15168:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18306 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18306/])
HDFS-15168: ABFS enhancement to translate AAD to Linux identities. (github: rev b2200a33a6cbb43998833d902578143f93bb192a)
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/IdentityTransformer.java
* (add) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/IdentityTransformerInterface.java
* (add) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/LocalIdentityTransformer.java
* (add) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
* (add) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/IdentityHandler.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/ConfigurationKeys.java
* (add) hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestTextFileBasedIdentityHandler.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/AbfsHttpConstants.java


> ABFS driver enhancement - Allow customizable translation from AAD SPNs and 
> security groups to Linux user and group
> --
>
> Key: HDFS-15168
> URL: https://issues.apache.org/jira/browse/HDFS-15168
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fs/azure
>Reporter: Karthik Amarnath
>Assignee: Karthik Amarnath
>Priority: Major
>
> The ABFS driver does not support the translation of AAD service principals 
> (SPNs) to Linux identities, causing metadata operations to fail. The Hadoop 
> MapReduce client 
> [[JobSubmissionFiles|https://github.com/apache/hadoop/blob/d842dfffa53c8b565f3d65af44ccd7e1cc706733/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmissionFiles.java#L138]]
>  expects the file owner permission to be the Linux identity, but the 
> underlying ABFS driver returns the AAD Object identity. Hence the ABFS 
> driver needs to be enhanced.
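
The kind of translation being asked for can be pictured as a lookup from AAD object IDs to local names. The sketch below is purely illustrative: AadToLinuxIdentityMap, its methods and the sample object ID are hypothetical, and it does not reproduce the IdentityTransformer or TextFileBasedIdentityHandler classes added by the commit.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical translator from AAD object IDs to local Linux user names. */
class AadToLinuxIdentityMap {
  private final Map<String, String> objectIdToUser = new HashMap<>();

  void addMapping(String aadObjectId, String linuxUser) {
    objectIdToUser.put(aadObjectId, linuxUser);
  }

  /** Returns the mapped Linux name, or the original identity when no mapping exists. */
  String toLinuxIdentity(String aadObjectId) {
    return objectIdToUser.getOrDefault(aadObjectId, aadObjectId);
  }
}

class AadToLinuxIdentityMapDemo {
  public static void main(String[] args) {
    AadToLinuxIdentityMap map = new AadToLinuxIdentityMap();
    // Made-up object ID and user name, for illustration only.
    map.addMapping("00000000-aaaa-bbbb-cccc-000000000001", "mapreduce");
    System.out.println(map.toLinuxIdentity("00000000-aaaa-bbbb-cccc-000000000001"));
    System.out.println(map.toLinuxIdentity("unmapped-id"));
  }
}
```

With such a mapping in place, a file owner reported as an AAD object ID would be presented under the Linux name that a client such as JobSubmissionFiles compares against.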






[jira] [Commented] (HDFS-14960) TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology

2020-05-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119185#comment-17119185
 ] 

Hadoop QA commented on HDFS-14960:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 37m 53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  5m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 51s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 39s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 36s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 49s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 21 unchanged - 1 fixed = 21 total (was 22) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}123m 16s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}230m 49s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor |
|   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestStripedFileAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29380/artifact/out/Dockerfile |
| JIRA Issue | HDFS-14960 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13004291/HDFS-14960.005.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 9cb733e716ef 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 

[jira] [Commented] (HDFS-15378) TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on trunk

2020-05-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119121#comment-17119121
 ] 

Hadoop QA commented on HDFS-15378:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 46s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 32s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}136m  3s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}217m 24s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
|   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.TestDFSRemove |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29379/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15378 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13004285/HDFS-15378.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 4d019b82a23c 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 

[jira] [Commented] (HDFS-14960) TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology

2020-05-28 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119061#comment-17119061
 ] 

Jim Brennan commented on HDFS-14960:


Thanks for the review [~inigoiri]!  I've addressed all of your comments in 
patch 005.


> TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology
> 
>
> Key: HDFS-14960
> URL: https://issues.apache.org/jira/browse/HDFS-14960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.3
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HDFS-14960.001.patch, HDFS-14960.002.patch, 
> HDFS-14960.003.patch, HDFS-14960.004.patch, HDFS-14960.005.patch
>
>
> As reported in HDFS-14958, TestBalancerWithNodeGroup was succeeding even 
> though it was using DFSNetworkTopology instead of 
> NetworkTopologyWithNodeGroup.
> [~inigoiri] rightly suggested that this indicates the test is not very good - 
> it should fail when run without NetworkTopologyWithNodeGroup.
> We should improve this test.
>  






[jira] [Updated] (HDFS-14960) TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology

2020-05-28 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HDFS-14960:
---
Attachment: HDFS-14960.005.patch

> TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology
> 
>
> Key: HDFS-14960
> URL: https://issues.apache.org/jira/browse/HDFS-14960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.3
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HDFS-14960.001.patch, HDFS-14960.002.patch, 
> HDFS-14960.003.patch, HDFS-14960.004.patch, HDFS-14960.005.patch
>
>
> As reported in HDFS-14958, TestBalancerWithNodeGroup was succeeding even 
> though it was using DFSNetworkTopology instead of 
> NetworkTopologyWithNodeGroup.
> [~inigoiri] rightly suggested that this indicates the test is not very good - 
> it should fail when run without NetworkTopologyWithNodeGroup.
> We should improve this test.
>  






[jira] [Commented] (HDFS-15246) ArrayIndexOfboundsException in BlockManager CreateLocatedBlock

2020-05-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119019#comment-17119019
 ] 

Íñigo Goiri commented on HDFS-15246:


* Can we add some assert in the test?
* We may want to extract L654.

> ArrayIndexOfboundsException in BlockManager CreateLocatedBlock
> --
>
> Key: HDFS-15246
> URL: https://issues.apache.org/jira/browse/HDFS-15246
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15246-testrepro.patch, HDFS-15246.001.patch
>
>
> java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:1362)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:1501)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:179)
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2047)
>  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:770)






[jira] [Commented] (HDFS-15377) BlockScanner scans one part per round, expect full scans after several rounds

2020-05-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119018#comment-17119018
 ] 

Íñigo Goiri commented on HDFS-15377:


* Most of the new code in scanBlock() should probably be a private method.
* We should fix the FindBugs issue, otherwise we will be flagged all the time.

> BlockScanner scans one part per round, expect full scans after several rounds
> -
>
> Key: HDFS-15377
> URL: https://issues.apache.org/jira/browse/HDFS-15377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15377.002.patch
>
>
> To reduce disk IO, one block is split into multiple parts, and BlockScanner 
> scans only one part per round. After several rounds, the full block should 
> have been scanned.
> Add a new option "dfs.block.scanner.part.size": the maximum data size per 
> scan by the block scanner. This value should be a multiple of the chunk size, 
> for example, 512, 1024, 4096 ...
>  The default value is -1, which disables partial scan.
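
The partial-scan idea in the description above can be sketched roughly as follows. This is only an illustration and not code from any HDFS-15377 patch: the class name PartialScanPlanner and the method nextRange are hypothetical, and the sketch assumes the part size is a multiple of the chunk size and that a non-positive value (such as the default -1) means "scan the whole block".

```java
/** Hypothetical illustration only; not code from the HDFS-15377 patch. */
public class PartialScanPlanner {

  /** Returns {offset, length} of the part of the block to scan in the given round. */
  public static long[] nextRange(long blockLength, long partSize, long round) {
    if (partSize <= 0 || partSize >= blockLength) {
      // Partial scan disabled (e.g. -1) or the block fits in one part: scan it all.
      return new long[] {0L, blockLength};
    }
    long parts = (blockLength + partSize - 1) / partSize; // ceiling division
    long index = round % parts;                           // rotate through the parts
    long offset = index * partSize;
    long length = Math.min(partSize, blockLength - offset); // last part may be short
    return new long[] {offset, length};
  }

  public static void main(String[] args) {
    long blockLength = 10_000L;
    long partSize = 4096L; // assumed to be a multiple of the chunk size
    for (long round = 0; round < 3; round++) {
      long[] range = nextRange(blockLength, partSize, round);
      System.out.println("round " + round + ": offset=" + range[0] + " length=" + range[1]);
    }
  }
}
```

With these assumed numbers the block is covered in three rounds, which is the "full scan after several rounds" behaviour the description asks for.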






[jira] [Commented] (HDFS-15366) Active NameNode went down with NPE

2020-05-28 Thread sarun singla (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119017#comment-17119017
 ] 

sarun singla commented on HDFS-15366:
-

[~hexiaoqiao] Let me confirm and get back. Thanks.

> Active NameNode went down with NPE
> --
>
> Key: HDFS-15366
> URL: https://issues.apache.org/jira/browse/HDFS-15366
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: sarun singla
>Priority: Critical
>
> {code:java}
> 2020-05-12 00:31:54,565 ERROR blockmanagement.BlockManager (BlockManager.java:run(3816)) - ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
>  at org.apache.hadoop.hdfs.server.namenode.INode.getParent(INode.java:629)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getRelativePathINodes(FSDirectory.java:1009)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFullPathINodes(FSDirectory.java:1015)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFullPathName(FSDirectory.java:1020)
>  at org.apache.hadoop.hdfs.server.namenode.INode.getFullPathName(INode.java:591)
>  at org.apache.hadoop.hdfs.server.namenode.INodeFile.getName(INodeFile.java:550)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3912)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3875)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1560)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1452)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3847)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3799)
>  at java.lang.Thread.run(Thread.java:748)
> 2020-05-12 00:31:54,567 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
> 2020-05-12 00:31:54,621 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down NameNode at xyz.com/xxx
> /{code}






[jira] [Comment Edited] (HDFS-13274) RBF: Extend RouterRpcClient to use multiple sockets

2020-05-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119013#comment-17119013
 ] 

Íñigo Goiri edited comment on HDFS-13274 at 5/28/20, 7:17 PM:
--

RpcServerNumOpenConnections seems pretty high; can we confirm that those 
connections are mainly because of renewLease() ops?
A potential solution is to make renewLease() more async or to have a different 
pool for those so we don't affect regular ops.
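
The "different pool" idea in this comment can be illustrated with a small sketch. It is not RouterRpcClient code: SplitPoolDispatcher, the pool sizes, and the thread names are all hypothetical, and dispatching on the method name is just an assumed way of telling renewLease() apart from regular operations.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

/** Hypothetical illustration; none of these names come from RouterRpcClient. */
public class SplitPoolDispatcher {
  // Pool sizes are arbitrary values chosen for the sketch.
  private final ExecutorService regularPool =
      Executors.newFixedThreadPool(4, daemonFactory("regular-op"));
  private final ExecutorService leasePool =
      Executors.newFixedThreadPool(1, daemonFactory("lease-op"));

  private static ThreadFactory daemonFactory(String name) {
    return runnable -> {
      Thread t = new Thread(runnable, name);
      t.setDaemon(true); // do not keep the JVM alive for idle workers
      return t;
    };
  }

  /** Routes renewLease calls to their own pool so they cannot starve other ops. */
  public <T> Future<T> submit(String method, Callable<T> call) {
    ExecutorService pool = "renewLease".equals(method) ? leasePool : regularPool;
    return pool.submit(call);
  }

  public void shutdown() {
    regularPool.shutdown();
    leasePool.shutdown();
  }

  public static void main(String[] args) throws Exception {
    SplitPoolDispatcher dispatcher = new SplitPoolDispatcher();
    Future<String> lease =
        dispatcher.submit("renewLease", () -> Thread.currentThread().getName());
    Future<String> regular =
        dispatcher.submit("getFileInfo", () -> Thread.currentThread().getName());
    System.out.println("renewLease ran on:  " + lease.get());
    System.out.println("getFileInfo ran on: " + regular.get());
    dispatcher.shutdown();
  }
}
```

Because the two kinds of call never share worker threads, a burst of lease renewals can only queue behind other lease renewals.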


was (Author: elgoiri):
RpcServerNumOpenConnections seems pretty high, can we confirm that those 
connections are mainly because of renewLease() ops?
A potential solution is to make renewLease() more sync or having a different 
pool for those so we don't affect regular ops.

> RBF: Extend RouterRpcClient to use multiple sockets
> ---
>
> Key: HDFS-13274
> URL: https://issues.apache.org/jira/browse/HDFS-13274
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>
> HADOOP-13144 introduces the ability to create multiple connections for the 
> same user and use different sockets. The RouterRpcClient should use this 
> approach to get a better throughput.






[jira] [Commented] (HDFS-13274) RBF: Extend RouterRpcClient to use multiple sockets

2020-05-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119013#comment-17119013
 ] 

Íñigo Goiri commented on HDFS-13274:


RpcServerNumOpenConnections seems pretty high, can we confirm that those 
connections are mainly because of renewLease() ops?
A potential solution is to make renewLease() more sync or having a different 
pool for those so we don't affect regular ops.

> RBF: Extend RouterRpcClient to use multiple sockets
> ---
>
> Key: HDFS-13274
> URL: https://issues.apache.org/jira/browse/HDFS-13274
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>
> HADOOP-13144 introduces the ability to create multiple connections for the 
> same user and use different sockets. The RouterRpcClient should use this 
> approach to get a better throughput.






[jira] [Commented] (HDFS-14960) TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology

2020-05-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119005#comment-17119005
 ] 

Íñigo Goiri commented on HDFS-14960:


Minor comments:
* As L198 only has one equals, let's use assertEquals().
* In verifyProperBlockPlacement(), the assertTrue() could give the block id 
when failing.
* Should verifyProperBlockPlacement() assert that there was at least one block 
to check? assertFalse(locatedBlocks.isEmpty())
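
The last two review points could look roughly like the sketch below. It is not the test from the patch: the Map from block id to replica node groups is an assumed data model, the "no two replicas in the same node group" rule is an assumption about what the check verifies, and assertTrue/assertFalse are local stand-ins for the JUnit methods so that the example runs on its own.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration; the data model and helpers are stand-ins. */
public class BlockPlacementCheck {

  // Local stand-ins for JUnit's assertTrue/assertFalse, to keep the sketch self-contained.
  static void assertTrue(String message, boolean condition) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  static void assertFalse(String message, boolean condition) {
    assertTrue(message, !condition);
  }

  /** Assumed model: block id mapped to the node groups holding its replicas. */
  public static void verifyProperBlockPlacement(Map<Long, List<String>> locatedBlocks) {
    // Fail loudly if there is nothing to check, instead of passing vacuously.
    assertFalse("No blocks to verify", locatedBlocks.isEmpty());
    for (Map.Entry<Long, List<String>> entry : locatedBlocks.entrySet()) {
      Set<String> seen = new HashSet<>();
      for (String nodeGroup : entry.getValue()) {
        // Include the block id so a failure points at the offending block.
        assertTrue("Block " + entry.getKey() + " has two replicas in node group "
            + nodeGroup, seen.add(nodeGroup));
      }
    }
  }

  public static void main(String[] args) {
    verifyProperBlockPlacement(Map.of(1L, List.of("/rack0/ng0", "/rack0/ng1")));
    System.out.println("placement ok");
  }
}
```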

> TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology
> 
>
> Key: HDFS-14960
> URL: https://issues.apache.org/jira/browse/HDFS-14960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.3
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HDFS-14960.001.patch, HDFS-14960.002.patch, 
> HDFS-14960.003.patch, HDFS-14960.004.patch
>
>
> As reported in HDFS-14958, TestBalancerWithNodeGroup was succeeding even 
> though it was using DFSNetworkTopology instead of 
> NetworkTopologyWithNodeGroup.
> [~inigoiri] rightly suggested that this indicates the test is not very good - 
> it should fail when run without NetworkTopologyWithNodeGroup.
> We should improve this test.
>  






[jira] [Commented] (HDFS-15378) TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on trunk

2020-05-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17119000#comment-17119000
 ] 

Íñigo Goiri commented on HDFS-15378:


Just for completeness, why is there a delay?

> TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on 
> trunk
> -
>
> Key: HDFS-15378
> URL: https://issues.apache.org/jira/browse/HDFS-15378
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Priority: Major
> Attachments: HDFS-15378.001.patch
>
>
> [https://builds.apache.org/job/PreCommit-HDFS-Build/29377/#showFailuresLink]
> [https://builds.apache.org/job/PreCommit-HDFS-Build/29368/]






[jira] [Commented] (HDFS-14960) TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology

2020-05-28 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118995#comment-17118995
 ] 

Jim Brennan commented on HDFS-14960:


The failed unit tests are unrelated to this change.


> TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology
> 
>
> Key: HDFS-14960
> URL: https://issues.apache.org/jira/browse/HDFS-14960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.3
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HDFS-14960.001.patch, HDFS-14960.002.patch, 
> HDFS-14960.003.patch, HDFS-14960.004.patch
>
>
> As reported in HDFS-14958, TestBalancerWithNodeGroup was succeeding even 
> though it was using DFSNetworkTopology instead of 
> NetworkTopologyWithNodeGroup.
> [~inigoiri] rightly suggested that this indicates the test is not very good - 
> it should fail when run without NetworkTopologyWithNodeGroup.
> We should improve this test.
>  






[jira] [Commented] (HDFS-14960) TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology

2020-05-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118982#comment-17118982
 ] 

Hadoop QA commented on HDFS-14960:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m  0s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 21 unchanged - 1 fixed = 21 total (was 22) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 31s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 49s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}180m 10s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.TestReconstructStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29378/artifact/out/Dockerfile |
| JIRA Issue | HDFS-14960 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13004271/HDFS-14960.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ed4b0d087e0a 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / a838d871a76 |
| Default Java | Private 

[jira] [Updated] (HDFS-15378) TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on trunk

2020-05-28 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15378:
-
Attachment: HDFS-15378.001.patch
Status: Patch Available  (was: Open)

> TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on 
> trunk
> -
>
> Key: HDFS-15378
> URL: https://issues.apache.org/jira/browse/HDFS-15378
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Priority: Major
> Attachments: HDFS-15378.001.patch
>
>
> [https://builds.apache.org/job/PreCommit-HDFS-Build/29377/#showFailuresLink]
> [https://builds.apache.org/job/PreCommit-HDFS-Build/29368/]






[jira] [Commented] (HDFS-15378) TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on trunk

2020-05-28 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118963#comment-17118963
 ] 

hemanthboyina commented on HDFS-15378:
--

There was a slight delay in the test case run; if we wait for 
curDn.getXmitsInProgress() == 0, the test case succeeds.
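
Waiting for a condition like this is usually done by polling with a timeout rather than sleeping for a fixed time. The sketch below shows that pattern in plain Java; it is not the HDFS-15378 patch. WaitForCondition and its waitFor method are hypothetical helpers written for this example, and the AtomicInteger is a stand-in for the value that curDn.getXmitsInProgress() would return.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of polling until a counter drains to zero. */
public class WaitForCondition {

  /** Polls the check every intervalMs until it is true or timeoutMs has elapsed. */
  public static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("Condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for the value reported by curDn.getXmitsInProgress().
    AtomicInteger xmitsInProgress = new AtomicInteger(3);

    // Simulates reconstruction work finishing a little later.
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      xmitsInProgress.set(0);
    });
    worker.start();

    waitFor(() -> xmitsInProgress.get() == 0, 10, 5000);
    System.out.println("xmits drained, safe to assert");
  }
}
```

The timeout keeps a genuinely stuck counter from hanging the test run, which a bare loop would do.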

> TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on 
> trunk
> -
>
> Key: HDFS-15378
> URL: https://issues.apache.org/jira/browse/HDFS-15378
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Priority: Major
>
> [https://builds.apache.org/job/PreCommit-HDFS-Build/29377/#showFailuresLink]
> [https://builds.apache.org/job/PreCommit-HDFS-Build/29368/]






[jira] [Created] (HDFS-15378) TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on trunk

2020-05-28 Thread hemanthboyina (Jira)
hemanthboyina created HDFS-15378:


 Summary: 
TestReconstructStripedFile#testErasureCodingWorkerXmitsWeight is failing on 
trunk
 Key: HDFS-15378
 URL: https://issues.apache.org/jira/browse/HDFS-15378
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: hemanthboyina


[https://builds.apache.org/job/PreCommit-HDFS-Build/29377/#showFailuresLink]

[https://builds.apache.org/job/PreCommit-HDFS-Build/29368/]






[jira] [Commented] (HDFS-15368) TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally

2020-05-28 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118840#comment-17118840
 ] 

Hudson commented on HDFS-15368:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18305 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18305/])
HDFS-15368. TestBalancerWithHANameNodes#testBalancerWithObserver failed 
(ayushsaxena: rev a838d871a76776016703f6c904fb049be2247626)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java


> TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally
> 
>
> Key: HDFS-15368
> URL: https://issues.apache.org/jira/browse/HDFS-15368
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: balancer, test
> Fix For: 3.4.0
>
> Attachments: HDFS-15368.001.patch, HDFS-15368.002.patch, 
> TestBalancerWithHANameNodes.testBalancerObserver.log, 
> TestBalancerWithHANameNodes.testBalancerObserver.log
>
>
> While working on HDFS-13183, I found that 
> TestBalancerWithHANameNodes#testBalancerWithObserver fails occasionally 
> because of the following code segment. Consider a cluster with 1 ANN + 1 SBN + 
> 2 ONN: when getBlocks is invoked with the Observer Read feature enabled, the 
> request can go to either of the two ObserverNNs, based on my observation. So 
> verifying only the first ObserverNN and checking the number of #getBlocks 
> invocations on it does not give the expected result.
> {code:java}
>   for (int i = 0; i < cluster.getNumNameNodes(); i++) {
>     // First observer node is at idx 2, or 3 if 2 has been shut down
>     // It should get both getBlocks calls, all other NNs should see 0 calls
>     int expectedObserverIdx = withObserverFailure ? 3 : 2;
>     int expectedCount = (i == expectedObserverIdx) ? 2 : 0;
>     verify(namesystemSpies.get(i), times(expectedCount))
>         .getBlocks(any(), anyLong(), anyLong());
>   }
> {code}
> cc [~xkrogen],[~weichiu]. I am not very familiar with the Observer Read feature; 
> could you give some suggestions?
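
One way to make such a check independent of which ObserverNN serves the request is to assert on the total across all observers. The sketch below shows that idea with plain counters; it is not the committed fix. ObserverCallCheck, the int array of per-NameNode call counts, and the assumption that observers occupy the highest indices are all hypothetical.

```java
/** Hypothetical illustration; per-NameNode call counts are modelled as an int array. */
public class ObserverCallCheck {

  /**
   * Returns true when NameNodes before firstObserverIdx saw no getBlocks calls
   * and the observers together saw exactly expectedTotal calls.
   */
  public static boolean callsLandOnObserversOnly(
      int[] calls, int firstObserverIdx, int expectedTotal) {
    int observerTotal = 0;
    for (int i = 0; i < calls.length; i++) {
      if (i < firstObserverIdx) {
        if (calls[i] != 0) {
          return false; // active or standby NameNode was contacted
        }
      } else {
        observerTotal += calls[i]; // any observer may have served the request
      }
    }
    return observerTotal == expectedTotal;
  }

  public static void main(String[] args) {
    // 1 ANN + 1 SBN + 2 ONN, with the two getBlocks calls split across observers.
    int[] split = {0, 0, 1, 1};
    // Same cluster, with both calls served by the first observer.
    int[] firstOnly = {0, 0, 2, 0};
    System.out.println(callsLandOnObserversOnly(split, 2, 2));
    System.out.println(callsLandOnObserversOnly(firstOnly, 2, 2));
  }
}
```

Both distributions pass, whereas the per-index check in the quoted segment would accept only the second one.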






[jira] [Commented] (HDFS-14960) TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology

2020-05-28 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118837#comment-17118837
 ] 

Jim Brennan commented on HDFS-14960:


Now that HDFS-13183 has been fixed, I uploaded patch 004 which is the same as 
patch 003, just rebased to the current trunk.


> TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology
> 
>
> Key: HDFS-14960
> URL: https://issues.apache.org/jira/browse/HDFS-14960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.3
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HDFS-14960.001.patch, HDFS-14960.002.patch, 
> HDFS-14960.003.patch, HDFS-14960.004.patch
>
>
> As reported in HDFS-14958, TestBalancerWithNodeGroup was succeeding even 
> though it was using DFSNetworkTopology instead of 
> NetworkTopologyWithNodeGroup.
> [~inigoiri] rightly suggested that this indicates the test is not very good - 
> it should fail when run without NetworkTopologyWithNodeGroup.
> We should improve this test.
>  






[jira] [Commented] (HDFS-15368) TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally

2020-05-28 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118835#comment-17118835
 ] 

Xiaoqiao He commented on HDFS-15368:


Thanks [~ayushtkn] for your help and review in fixing this failing test.

> TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally
> 
>
> Key: HDFS-15368
> URL: https://issues.apache.org/jira/browse/HDFS-15368
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: balancer, test
> Fix For: 3.4.0
>
> Attachments: HDFS-15368.001.patch, HDFS-15368.002.patch, 
> TestBalancerWithHANameNodes.testBalancerObserver.log, 
> TestBalancerWithHANameNodes.testBalancerObserver.log
>
>
> While working on HDFS-13183, I found that 
> TestBalancerWithHANameNodes#testBalancerWithObserver fails occasionally 
> because of the following code segment. Consider a cluster with 1 ANN + 1 SBN + 
> 2 ONN: when getBlocks is invoked with the Observer Read feature enabled, the 
> request can go to either of the two ObserverNNs, based on my observation. So 
> verifying only the first ObserverNN and checking the number of #getBlocks 
> invocations on it does not give the expected result.
> {code:java}
>   for (int i = 0; i < cluster.getNumNameNodes(); i++) {
>     // First observer node is at idx 2, or 3 if 2 has been shut down
>     // It should get both getBlocks calls, all other NNs should see 0 calls
>     int expectedObserverIdx = withObserverFailure ? 3 : 2;
>     int expectedCount = (i == expectedObserverIdx) ? 2 : 0;
>     verify(namesystemSpies.get(i), times(expectedCount))
>         .getBlocks(any(), anyLong(), anyLong());
>   }
> {code}
> cc [~xkrogen],[~weichiu]. I am not very familiar with the Observer Read feature; 
> could you give some suggestions?






[jira] [Updated] (HDFS-14960) TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology

2020-05-28 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HDFS-14960:
---
Attachment: HDFS-14960.004.patch

> TestBalancerWithNodeGroup should not succeed with DFSNetworkTopology
> 
>
> Key: HDFS-14960
> URL: https://issues.apache.org/jira/browse/HDFS-14960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.3
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HDFS-14960.001.patch, HDFS-14960.002.patch, 
> HDFS-14960.003.patch, HDFS-14960.004.patch
>
>
> As reported in HDFS-14958, TestBalancerWithNodeGroup was succeeding even 
> though it was using DFSNetworkTopology instead of 
> NetworkTopologyWithNodeGroup.
> [~inigoiri] rightly suggested that this indicates the test is not very good - 
> it should fail when run without NetworkTopologyWithNodeGroup.
> We should improve this test.
>  






[jira] [Commented] (HDFS-15366) Active NameNode went down with NPE

2020-05-28 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118831#comment-17118831
 ] 

Xiaoqiao He commented on HDFS-15366:


[~saruntek] Please check whether HDFS-12832 solves your issue on 
branch-2.7.3. If so, I would like to close this JIRA later.

> Active NameNode went down with NPE
> --
>
> Key: HDFS-15366
> URL: https://issues.apache.org/jira/browse/HDFS-15366
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: sarun singla
>Priority: Critical
>
> {code:java}
> 2020-05-12 00:31:54,565 ERROR blockmanagement.BlockManager (BlockManager.java:run(3816)) - ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
>  at org.apache.hadoop.hdfs.server.namenode.INode.getParent(INode.java:629)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getRelativePathINodes(FSDirectory.java:1009)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFullPathINodes(FSDirectory.java:1015)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFullPathName(FSDirectory.java:1020)
>  at org.apache.hadoop.hdfs.server.namenode.INode.getFullPathName(INode.java:591)
>  at org.apache.hadoop.hdfs.server.namenode.INodeFile.getName(INodeFile.java:550)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3912)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3875)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1560)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1452)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3847)
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3799)
>  at java.lang.Thread.run(Thread.java:748)
> 2020-05-12 00:31:54,567 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
> 2020-05-12 00:31:54,621 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down NameNode at xyz.com/xxx
> /{code}






[jira] [Updated] (HDFS-15368) TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally

2020-05-28 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15368:

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally
> 
>
> Key: HDFS-15368
> URL: https://issues.apache.org/jira/browse/HDFS-15368
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: balancer, test
> Fix For: 3.4.0
>
> Attachments: HDFS-15368.001.patch, HDFS-15368.002.patch, 
> TestBalancerWithHANameNodes.testBalancerObserver.log, 
> TestBalancerWithHANameNodes.testBalancerObserver.log
>
>
> While working on HDFS-13183, I found that 
> TestBalancerWithHANameNodes#testBalancerWithObserver fails occasionally 
> because of the following code segment. Consider a cluster with 1 ANN + 1 SBN + 
> 2 ONN: when getBlocks is invoked with the Observer Read feature enabled, the 
> request can go to either of the two ObserverNNs, based on my observation. So 
> verifying only the first ObserverNN and checking the number of #getBlocks 
> invocations on it does not give the expected result.
> {code:java}
>   for (int i = 0; i < cluster.getNumNameNodes(); i++) {
>     // First observer node is at idx 2, or 3 if 2 has been shut down
>     // It should get both getBlocks calls, all other NNs should see 0 calls
>     int expectedObserverIdx = withObserverFailure ? 3 : 2;
>     int expectedCount = (i == expectedObserverIdx) ? 2 : 0;
>     verify(namesystemSpies.get(i), times(expectedCount))
>         .getBlocks(any(), anyLong(), anyLong());
>   }
> {code}
> cc [~xkrogen],[~weichiu]. I am not very familiar with the Observer Read feature; 
> could you give some suggestions?






[jira] [Commented] (HDFS-15368) TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally

2020-05-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118824#comment-17118824
 ] 

Ayush Saxena commented on HDFS-15368:
-

Committed to trunk.
Thanks [~hexiaoqiao] for the contribution!

> TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally
> 
>
> Key: HDFS-15368
> URL: https://issues.apache.org/jira/browse/HDFS-15368
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: balancer, test
> Attachments: HDFS-15368.001.patch, HDFS-15368.002.patch, 
> TestBalancerWithHANameNodes.testBalancerObserver.log, 
> TestBalancerWithHANameNodes.testBalancerObserver.log
>
>
> While working on HDFS-13183, I found that 
> TestBalancerWithHANameNodes#testBalancerWithObserver fails occasionally 
> because of the following code segment. Consider 1 ANN + 1 SBN + 2 ONN: 
> when getBlocks is invoked with the Observer Read feature enabled, the request 
> can go to either of the two ObserverNNs, based on my observation. So verifying 
> only the first ObserverNN and checking the number of #getBlocks invocations 
> on it is not a valid expectation.
> {code:java}
>   for (int i = 0; i < cluster.getNumNameNodes(); i++) {
>     // First observer node is at idx 2, or 3 if 2 has been shut down
>     // It should get both getBlocks calls, all other NNs should see 0 calls
>     int expectedObserverIdx = withObserverFailure ? 3 : 2;
>     int expectedCount = (i == expectedObserverIdx) ? 2 : 0;
>     verify(namesystemSpies.get(i), times(expectedCount))
>         .getBlocks(any(), anyLong(), anyLong());
>   }
> {code}
> cc [~xkrogen],[~weichiu]. I am not very familiar with the Observer Read 
> feature; could you give some suggestions? 
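
A minimal sketch of the looser check the description points toward, assuming the goal is that the getBlocks calls land on the observers as a group rather than on one fixed index. The class and method names are hypothetical, plain call counts stand in for the Mockito spies, and this is not taken from the committed patch.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the HDFS test code.
final class ObserverCallCheck {
  /**
   * Returns true when the getBlocks calls are spread over the observers in any
   * way that sums to expectedTotal and no other NameNode received a call.
   */
  static boolean callsLandedOnObservers(List<Integer> callsPerNameNode,
      Set<Integer> observerIdxs, int expectedTotal) {
    int observerCalls = 0;
    for (int i = 0; i < callsPerNameNode.size(); i++) {
      int calls = callsPerNameNode.get(i);
      if (observerIdxs.contains(i)) {
        observerCalls += calls;
      } else if (calls != 0) {
        return false; // an active or standby NameNode served getBlocks
      }
    }
    return observerCalls == expectedTotal;
  }

  public static void main(String[] args) {
    Set<Integer> observers = Set.of(2, 3); // 1 ANN + 1 SBN + 2 ONN
    // Both calls on the first observer: accepted.
    System.out.println(callsLandedOnObservers(List.of(0, 0, 2, 0), observers, 2));
    // One call on each observer: also accepted.
    System.out.println(callsLandedOnObservers(List.of(0, 0, 1, 1), observers, 2));
    // One call reached the active NameNode: rejected.
    System.out.println(callsLandedOnObservers(List.of(1, 0, 1, 0), observers, 2));
  }
}
```

With an observer shut down (the withObserverFailure case), the same check would be run with only the surviving observer index in the set.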






[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-05-28 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118797#comment-17118797
 ] 

Xiaoqiao He commented on HDFS-15180:


Thanks [~sodonnell] for your positive feedback. 
{quote}
we were hoping to get HDFS-15160 committed first, as it's a smaller and possibly 
simpler change
{quote}
It makes sense to me. My colleague [~Aiphag0] and I would like to follow up 
and push this feature forward once HDFS-15160 is ready. Any suggestions and 
discussion are welcome if anyone is interested in these changes.

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: Aiphago
>Priority: Major
> Attachments: HDFS-15180.001.patch, HDFS-15180.002.patch, 
> HDFS-15180.003.patch, HDFS-15180.004.patch, 
> image-2020-03-10-17-22-57-391.png, image-2020-03-10-17-31-58-830.png, 
> image-2020-03-10-17-34-26-368.png, image-2020-04-09-11-20-36-459.png
>
>
> Currently the FsDatasetImpl datasetLock is heavily contended when there are 
> many namespaces in a big cluster. This issue proposes splitting the 
> FsDatasetImpl datasetLock by block pool. 
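
A minimal sketch of what per-block-pool locking could look like, assuming one read/write lock per block pool ID kept in a concurrent map. The class, method names and block pool IDs are hypothetical and are not taken from the attached patches.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration; not the FsDatasetImpl implementation.
final class BlockPoolLocks {
  private final ConcurrentMap<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  /** Returns the lock for one block pool, creating it on first use. */
  ReentrantReadWriteLock lockFor(String bpid) {
    return locks.computeIfAbsent(bpid, id -> new ReentrantReadWriteLock());
  }

  /** Tries to take and release the write lock for bpid from a separate thread. */
  static boolean tryWriteFromOtherThread(BlockPoolLocks locks, String bpid) {
    boolean[] acquired = new boolean[1];
    Thread t = new Thread(() -> {
      ReentrantReadWriteLock.WriteLock w = locks.lockFor(bpid).writeLock();
      acquired[0] = w.tryLock();
      if (acquired[0]) {
        w.unlock();
      }
    });
    t.start();
    try {
      t.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return acquired[0];
  }

  public static void main(String[] args) {
    BlockPoolLocks locks = new BlockPoolLocks();
    locks.lockFor("BP-1").writeLock().lock();
    try {
      // A writer on another block pool is not blocked by the BP-1 writer.
      System.out.println(tryWriteFromOtherThread(locks, "BP-2"));
      // A second writer on the same block pool is blocked.
      System.out.println(tryWriteFromOtherThread(locks, "BP-1"));
    } finally {
      locks.lockFor("BP-1").writeLock().unlock();
    }
  }
}
```

With a single dataset-wide lock, both attempts would fail while the first writer holds it; with one lock per block pool, only operations on the same namespace contend.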






[jira] [Commented] (HDFS-15368) TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally

2020-05-28 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118784#comment-17118784
 ] 

Xiaoqiao He commented on HDFS-15368:


Thanks [~ayushtkn]. 
TestDecommissionWithBackoffMonitor.testAllocAndIBRWhileDecommission passes 
locally. 
TestNameNodeRetryCacheMetrics.testRetryCacheMetrics fails occasionally; I 
think it is not related to this change or HDFS-13183, but I have not found the 
root cause yet. 
The other failed unit tests are all about EC and are also unrelated, IMO.







[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-05-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118692#comment-17118692
 ] 

Hadoop QA commented on HDFS-15160:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 9s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 25m 49s | trunk passed |
| +1 | compile | 1m 27s | trunk passed |
| +1 | checkstyle | 1m 8s | trunk passed |
| +1 | mvnsite | 1m 30s | trunk passed |
| +1 | shadedclient | 20m 28s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 50s | trunk passed |
| 0 | spotbugs | 4m 4s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 59s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 25s | the patch passed |
| +1 | compile | 1m 17s | the patch passed |
| +1 | javac | 1m 17s | the patch passed |
| +1 | checkstyle | 0m 59s | the patch passed |
| +1 | mvnsite | 1m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 18m 17s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 47s | the patch passed |
| +1 | findbugs | 3m 40s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 103m 29s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 188m 19s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestRollingUpgrade |
| | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
| | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.TestReconstructStripedFile |
| | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
| | hadoop.hdfs.TestStripedFileAppend |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29377/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15160 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13004241/HDFS-15160.006.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 593e071b33cb 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 

[jira] [Commented] (HDFS-15368) TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally

2020-05-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118594#comment-17118594
 ] 

Ayush Saxena commented on HDFS-15368:
-

v002 LGTM +1
[~hexiaoqiao] can you confirm about the failed tests?







[jira] [Commented] (HDFS-15368) TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally

2020-05-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118543#comment-17118543
 ] 

Hadoop QA commented on HDFS-15368:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 22s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 21m 47s | trunk passed |
| +1 | compile | 1m 8s | trunk passed |
| +1 | checkstyle | 0m 46s | trunk passed |
| +1 | mvnsite | 1m 14s | trunk passed |
| +1 | shadedclient | 17m 35s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 43s | trunk passed |
| 0 | spotbugs | 3m 4s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 2s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 9s | the patch passed |
| +1 | compile | 1m 3s | the patch passed |
| +1 | javac | 1m 3s | the patch passed |
| +1 | checkstyle | 0m 40s | the patch passed |
| +1 | mvnsite | 1m 8s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 42s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 45s | the patch passed |
| +1 | findbugs | 3m 22s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 112m 52s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 186m 31s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDecommissionWithBackoffMonitor |
| | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
| | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.TestReconstructStripedFile |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29376/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15368 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13003870/HDFS-15368.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux fecdfed39c93 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9b38be43c63 |
| Default Java | 

[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-05-28 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118537#comment-17118537
 ] 

Stephen O'Donnell commented on HDFS-15180:
--

[~hexiaoqiao] We are interested in this change, but we were hoping to get 
HDFS-15160 committed first, as it's a smaller and possibly simpler change. 
HDFS-15160 is ready from a review point of view, but we are waiting for some 
tests on production clusters to see if it brings any problems.







[jira] [Updated] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-05-28 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-15160:
-
Attachment: HDFS-15160.006.patch

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> HDFS-15160.006.patch, image-2020-04-10-17-18-08-128.png, 
> image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it. This Jira switches read operations against the volume map to use the 
> readLock rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.
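
A minimal sketch of the read-lock versus write-lock split the description relies on, using java.util.concurrent's ReentrantReadWriteLock. The class below is a hypothetical stand-in with String values; it is not the HDFS ReplicaMap and does not reflect the patch contents.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for illustration; this is not the HDFS ReplicaMap.
final class SketchReplicaMap {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<Long, String> replicas = new HashMap<>();

  /** Mutation: needs the exclusive write lock. */
  void add(long blockId, String replica) {
    lock.writeLock().lock();
    try {
      replicas.put(blockId, replica);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Lookup: the shared read lock is enough, so lookups can run in parallel. */
  String get(long blockId) {
    lock.readLock().lock();
    try {
      return replicas.get(blockId);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** True if another thread completes a lookup while this thread holds the read lock. */
  boolean lookupsOverlap(long blockId) {
    lock.readLock().lock();
    try {
      String[] seen = new String[1];
      Thread reader = new Thread(() -> seen[0] = get(blockId));
      reader.start();
      reader.join(2000); // a blocked reader would still be alive after this
      return !reader.isAlive() && seen[0] != null;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    SketchReplicaMap map = new SketchReplicaMap();
    map.add(1L, "replica-1");
    System.out.println(map.get(1L));
    System.out.println(map.lookupsOverlap(1L));
  }
}
```

Had get() taken the write lock instead, the second reader in lookupsOverlap would wait until the first released it, which is the contention the change aims to reduce.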






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-05-28 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118535#comment-17118535
 ] 

Stephen O'Donnell commented on HDFS-15160:
--

Hi [~Jiang Xin] - it would be a great help if you could try this on your 
production cluster. From review we feel the 005 patch is good, but real world 
testing would give us more confidence.

{quote}
But the code `synchronized(replica) ... ` in method getBlockLocalPathInfo 
confuses me. It's holding the write lock, so it doesn't need to worry about 
updating genStamps. I assume that you wanted to change method 
getBlockLocalPathInfo to use the read lock, am I right?
{quote}

Well spotted - you are correct. I have uploaded a 006 patch to make the lock in 
getBlockLocalPathInfo a readlock.








[jira] [Comment Edited] (HDFS-14320) Support skipTrash for WebHDFS

2020-05-28 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118235#comment-17118235
 ] 

Chao Sun edited comment on HDFS-14320 at 5/28/20, 10:16 AM:


Bumping this up, as it seems to be an important feature. Curious what the 
current status is, [~kpalanisamy], [~weichiu].


was (Author: csun):
Bumping this up, as it seems to be an important feature. Curious what the 
current status is, [~weichiu].

> Support skipTrash for WebHDFS 
> --
>
> Key: HDFS-14320
> URL: https://issues.apache.org/jira/browse/HDFS-14320
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, webhdfs
>Affects Versions: 3.2.0
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
> Attachments: HDFS-14320-001.patch, HDFS-14320-002.patch, 
> HDFS-14320-003.patch, HDFS-14320-004.patch, HDFS-14320-005.patch, 
> HDFS-14320-006.patch, HDFS-14320-007.patch, HDFS-14320-008.patch
>
>
> Files/directories deleted via a WebHDFS REST call do not go through the 
> trash; they are deleted permanently. This feature is very important to us 
> because one of our users accidentally deleted a large directory.
> By default, the skiptrash option is set to true (skiptrash=true), so any 
> files deleted using curl are permanently deleted.
> Example:
> curl -iv -X DELETE 
> "http://:50070/webhdfs/v1/tmp/sampledata?op=DELETE=hdfs=true;
>  
> Use skiptrash=false to move files to trash instead.
> Example:
> curl -iv -X DELETE 
> "http://:50070/webhdfs/v1/tmp/sampledata?op=DELETE=hdfs=true=false;
>  






[jira] [Comment Edited] (HDFS-14773) SWEBHDFS closes the connection before a client can read the error response for a DSQuotaExceededException

2020-05-28 Thread Zhao Yi Ming (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118518#comment-17118518
 ] 

Zhao Yi Ming edited comment on HDFS-14773 at 5/28/20, 10:05 AM:


[~simbadzina] I tried the recreate steps (Hadoop version 3.1.1) and it seems 
to return the correct response. Could you take a look and see whether I missed 
anything? Thanks!

 

 
{code:java}
$ curl -L -i -X PUT -T file 
"https://:50470/webhdfs/v1/quota/file?op=CREATE" --cacert 
/data/zhaoyim/ssl/test_ca_cert

HTTP/1.1 100 Continue

HTTP/1.1 307 Temporary Redirect
Date: Thu, 28 May 2020 09:55:37 GMT
Cache-Control: no-cache
Expires: Thu, 28 May 2020 09:55:37 GMT
Date: Thu, 28 May 2020 09:55:37 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Location: 
https://:50475/webhdfs/v1/quota/install_solr_service.sh?op=CREATE=:8020==true=false
Content-Type: application/octet-stream
Content-Length: 0

HTTP/1.1 100 Continue

HTTP/1.1 403 Forbidden
Content-Type: application/json; charset=utf-8
Content-Length: 2171
Connection: close

{{ "RemoteException": { "exception": "DSQuotaExceededException", 
"javaClassName": "org.apache.hadoop.hdfs.protocol.DSQuotaExceededException", 
"message": "The DiskSpace quota of /quota is exceeded: quota = 1024 B = 1 KB 
but diskspace consumed = 134217728 B = 128 MB\n\tat 
org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:195)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:222)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1154)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:986)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:945)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.addBlock(FSDirWriteFileOp.java:504)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.saveAllocatedBlock(FSDirWriteFileOp.java:771)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.storeAllocatedBlock(FSDirWriteFileOp.java:259)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2714)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:875)\n\tat
 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:561)\n\tat
 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat
 org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat 
java.security.AccessController.doPrivileged(Native Method)\n\tat 
javax.security.auth.Subject.doAs(Subject.java:422)\n\tat 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat
 org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n" }}

{code}
 

BTW: swebhdfs worked well on my env:
{code:java}
$ hdfs dfs -ls swebhdfs://:50470/quota
Found 1 items
-rw-r--r--   1 dr.who hdfs  0 2020-05-28 17:55 
swebhdfs://:50470/quota/install_solr_service.sh
$
{code}



[jira] [Commented] (HDFS-14773) SWEBHDFS closes the connection before a client can read the error response for a DSQuotaExceededException

2020-05-28 Thread Zhao Yi Ming (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118518#comment-17118518
 ] 

Zhao Yi Ming commented on HDFS-14773:
-

[~simbadzina] I tried the recreate steps (Hadoop version 3.1.1) and it seems 
to return the correct response. Could you take a look and see whether I missed 
anything? Thanks!

 

 
{code:java}
$ curl -L -i -X PUT -T file 
"https://:50470/webhdfs/v1/quota/file?op=CREATE" --cacert 
/data/zhaoyim/ssl/test_ca_cert

HTTP/1.1 100 Continue

HTTP/1.1 307 Temporary Redirect
Date: Thu, 28 May 2020 09:55:37 GMT
Cache-Control: no-cache
Expires: Thu, 28 May 2020 09:55:37 GMT
Date: Thu, 28 May 2020 09:55:37 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Location: 
https://:50475/webhdfs/v1/quota/install_solr_service.sh?op=CREATE=:8020==true=false
Content-Type: application/octet-stream
Content-Length: 0

HTTP/1.1 100 Continue

HTTP/1.1 403 Forbidden
Content-Type: application/json; charset=utf-8
Content-Length: 2171
Connection: close

{{ "RemoteException": { "exception": "DSQuotaExceededException", 
"javaClassName": "org.apache.hadoop.hdfs.protocol.DSQuotaExceededException", 
"message": "The DiskSpace quota of /quota is exceeded: quota = 1024 B = 1 KB 
but diskspace consumed = 134217728 B = 128 MB\n\tat 
org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:195)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:222)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1154)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:986)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:945)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.addBlock(FSDirWriteFileOp.java:504)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.saveAllocatedBlock(FSDirWriteFileOp.java:771)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.storeAllocatedBlock(FSDirWriteFileOp.java:259)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2714)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:875)\n\tat
 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:561)\n\tat
 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat
 org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat 
java.security.AccessController.doPrivileged(Native Method)\n\tat 
javax.security.auth.Subject.doAs(Subject.java:422)\n\tat 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat
 org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n" }}

{code}
 

> SWEBHDFS closes the connection before a client can read the error response 
> for a DSQuotaExceededException
> -
>
> Key: HDFS-14773
> URL: https://issues.apache.org/jira/browse/HDFS-14773
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Simbarashe Dzinamarira
>Priority: Major
> Attachments: HDFS-14773-failing-test.patch
>
>
> When a DSQuotaExceededException is encountered using swebhdfs, the connection 
> is closed before the client can read the error response. This does not happen 
> for webhdfs.
> Attached is a patch for a test case that exposes the bug.
> You can recreate the bug on a live cluster using the steps below.
> *1) Create a directory and set a space quota*
> hdfs mkdir 
> hdfs dfsadmin -setSpaceQuota   
> *2) Write a file whose size exceeds the quota, using swebhdfs.*
> curl -L -i --negotiate -u : -X PUT -T largeFile 
> ":/webhdfs/v1//largeFile?op=CREATE"
>  
>  






[jira] [Commented] (HDFS-13274) RBF: Extend RouterRpcClient to use multiple sockets

2020-05-28 Thread xuzq (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118439#comment-17118439
 ] 

xuzq commented on HDFS-13274:
-

Maybe we can change the RenewLease RPC to bring the list of files being written 
to the Router.

> RBF: Extend RouterRpcClient to use multiple sockets
> ---
>
> Key: HDFS-13274
> URL: https://issues.apache.org/jira/browse/HDFS-13274
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>
> HADOOP-13144 introduces the ability to create multiple connections for the 
> same user and use different sockets. The RouterRpcClient should use this 
> approach to get a better throughput.






[jira] [Commented] (HDFS-13274) RBF: Extend RouterRpcClient to use multiple sockets

2020-05-28 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118425#comment-17118425
 ] 

Janus Chow commented on HDFS-13274:
---

Yes, all the routers have similar numbers.

> RBF: Extend RouterRpcClient to use multiple sockets
> ---
>
> Key: HDFS-13274
> URL: https://issues.apache.org/jira/browse/HDFS-13274
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>
> HADOOP-13144 introduces the ability to create multiple connections for the 
> same user and use different sockets. The RouterRpcClient should use this 
> approach to get a better throughput.






[jira] [Commented] (HDFS-13183) Standby NameNode process getBlocks request to reduce Active load

2020-05-28 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118421#comment-17118421
 ] 

Hudson commented on HDFS-13183:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18304 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18304/])
HDFS-13183. Addendum: Standby NameNode process getBlocks request to 
(ayushsaxena: rev 9b38be43c6323077a7be14e1295ad484c4038372)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java


> Standby NameNode process getBlocks request to reduce Active load
> 
>
> Key: HDFS-13183
> URL: https://issues.apache.org/jira/browse/HDFS-13183
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover, namenode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-13183-trunk.001.patch, HDFS-13183-trunk.002.patch, 
> HDFS-13183-trunk.003.patch, HDFS-13183.004.patch, HDFS-13183.005.patch, 
> HDFS-13183.006.patch, HDFS-13183.007.patch, HDFS-13183.addendum.patch, 
> HDFS-13183.addendum.patch
>
>
> The performance of the Active NameNode can be impacted when {{Balancer}} requests 
> #getBlocks, since querying the blocks of overly full DNs is currently extremely 
> inefficient. The main reason is that {{NameNodeRpcServer#getBlocks}} 
> holds the read lock for a long time. In the extreme case, all handlers of the Active 
> NameNode RPC server are occupied by one reader, 
> {{NameNodeRpcServer#getBlocks}}, and other write operation calls, so the Active 
> NameNode enters a state of false death for seconds or even minutes.
> Similar performance concerns about the Balancer have been reported in HDFS-9412, 
> HDFS-7967, etc.
> If the Standby NameNode can shoulder the heavy #getBlocks burden, it could speed up 
> balancing and reduce the performance impact on the Active NameNode.






[jira] [Commented] (HDFS-13274) RBF: Extend RouterRpcClient to use multiple sockets

2020-05-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118420#comment-17118420
 ] 

Ayush Saxena commented on HDFS-13274:
-

GetServerDefaults doesn't go to all namespaces as of now; it goes to the 
default NS if available, else to any one of the available NSs. We have cached 
that too in HDFS-15096.
For RenewLease, I don't think we can change anything here; it is bound to go to all 
namespaces, and if one NS is slow, it is bound to suffer. Maybe some 
configuration tuning to change lease times and so on can be done, depending 
on the use case.
GetListing may take time if the listing includes mount entries, say if 
you are listing / and have a bunch of mount entries, since the number of 
children, permissions and so on need to be fetched from the Namenode and 
then the entry needs to be recreated. Otherwise, if it is just a proxy with no mount 
entries, it shouldn't take much time. If you have multiple destinations 
for mount points, it would take even more time, if the listing is to include mount 
entries.

You said there are 16 Routers; do all of them have similar numbers?

> RBF: Extend RouterRpcClient to use multiple sockets
> ---
>
> Key: HDFS-13274
> URL: https://issues.apache.org/jira/browse/HDFS-13274
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>
> HADOOP-13144 introduces the ability to create multiple connections for the 
> same user and use different sockets. The RouterRpcClient should use this 
> approach to get a better throughput.






[jira] [Commented] (HDFS-14984) HDFS setQuota: Error message should be added for invalid input max range value to hdfs dfsadmin -setQuota command

2020-05-28 Thread Zhao Yi Ming (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118413#comment-17118413
 ] 

Zhao Yi Ming commented on HDFS-14984:
-

After some deeper investigation, I found that the DFSClient.java code did NOT have the 
problem: both set quota and clear quota need to call this method, and 
Long.MAX_VALUE is used internally as the quota default value, meaning "do not set the 
quota" (keep the old values). So I just added more detail to the check and 
separated it, one for ns and another for ss. The test results follow:

 
{code:java}
$ hdfs dfsadmin -setQuota 0 /quota
setQuota: Invalid values for namespace quota : 0
Usage: hdfs dfsadmin [-setQuota  ...]

$ hdfs dfsadmin -setQuota 9223372036854775807 /quota
WARN: "9223372036854775807" means QUOTA_DONT_SET, quota will not be set, it 
keep the old values.

$ hdfs dfsadmin -setQuota 9223372036854775808 /quota
setQuota: "9223372036854775808" is not a valid value for a quota.
Usage: hdfs dfsadmin [-setQuota  ...]

$ hdfs dfsadmin -setSpaceQuota 9223372036854775808 /quota
setSpaceQuota: "9223372036854775808" is not a valid value for a quota.
Usage: hdfs dfsadmin [-setSpaceQuota  [-storageType ] 
...]

$ hdfs dfsadmin -setSpaceQuota 9223372036854775807 /quota
WARN: "9223372036854775807" means QUOTA_DONT_SET, quota will not be set, it 
keep the old values.
$
{code}
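
The three namespace-quota outcomes shown above (rejected, "don't set" warning, accepted) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the patch: `QuotaArgCheck`, `Result` and `checkNamespaceQuota` are hypothetical names, the local `QUOTA_DONT_SET` constant mirrors `HdfsConstants.QUOTA_DONT_SET` (which is `Long.MAX_VALUE`), and size suffixes such as `1g` are ignored.

```java
// Hypothetical sketch; not taken from DFSAdmin or DFSClient.
final class QuotaArgCheck {
  // Mirrors HdfsConstants.QUOTA_DONT_SET, which HDFS defines as Long.MAX_VALUE.
  static final long QUOTA_DONT_SET = Long.MAX_VALUE;

  enum Result { OK, INVALID, DONT_SET }

  // Classifies a namespace quota argument given on the command line.
  static Result checkNamespaceQuota(String arg) {
    final long value;
    try {
      value = Long.parseLong(arg);
    } catch (NumberFormatException e) {
      // Not a number, or out of range for a long (e.g. 9223372036854775808).
      return Result.INVALID;
    }
    if (value <= 0) {
      // A namespace quota of 0 or below is rejected, as in the first test above.
      return Result.INVALID;
    }
    if (value == QUOTA_DONT_SET) {
      // Sentinel: the quota is left unchanged, so the caller should warn.
      return Result.DONT_SET;
    }
    return Result.OK;
  }
}
```

With this sketch, `checkNamespaceQuota("9223372036854775807")` yields `DONT_SET`, so a caller could print the warning instead of returning silently.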
 

[~hemanthboyina] [~SouryakantaDwivedy] Please help review. Thanks!

 

> HDFS setQuota: Error message should be added for invalid input max range 
> value to hdfs dfsadmin -setQuota command
> -
>
> Key: HDFS-14984
> URL: https://issues.apache.org/jira/browse/HDFS-14984
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Souryakanta Dwivedy
>Assignee: Zhao Yi Ming
>Priority: Minor
> Attachments: image-2019-11-13-14-05-19-603.png, 
> image-2019-11-13-14-07-04-536.png
>
>
> An error message should be added for the invalid input max range value 
> "9223372036854775807" to the hdfs dfsadmin -setQuota command.
>  * Set the quota for a directory with the invalid input value 
> "9223372036854775807": the command succeeds without displaying any 
> result. The quota value is not set for the directory internally, but it 
> would be better from a usability point of view if an error message were 
> displayed for the invalid max range value "9223372036854775807", as is 
> done when the input value is "0". For example: "hdfs 
> dfsadmin -setQuota 9223372036854775807 /quota"
> !image-2019-11-13-14-05-19-603.png!
>  
>  * Try to set the quota for a directory with the invalid input value "0": it 
> throws the error message "setQuota: Invalid values for quota : 0 and 
> 9223372036854775807". For example: "hdfs dfsadmin -setQuota 0 /quota"
> !image-2019-11-13-14-07-04-536.png!






[jira] [Commented] (HDFS-15368) TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally

2020-05-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118411#comment-17118411
 ] 

Ayush Saxena commented on HDFS-15368:
-

Thanks [~hexiaoqiao] for the confirmation. I have pushed the addendum and triggered 
the build again. 

> TestBalancerWithHANameNodes#testBalancerWithObserver failed occasionally
> 
>
> Key: HDFS-15368
> URL: https://issues.apache.org/jira/browse/HDFS-15368
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: balancer, test
> Attachments: HDFS-15368.001.patch, HDFS-15368.002.patch, 
> TestBalancerWithHANameNodes.testBalancerObserver.log, 
> TestBalancerWithHANameNodes.testBalancerObserver.log
>
>
> While working on HDFS-13183, I found that 
> TestBalancerWithHANameNodes#testBalancerWithObserver fails occasionally 
> because of the following code segment. Consider a cluster with 1 ANN + 1 SBN + 2 ONNs: 
> when getBlocks is invoked with the Observer Read feature enabled, the request can go to 
> either of the two ObserverNNs, based on my observation. So verifying only the first 
> ObserverNN and checking the number of #getBlocks invocations on it does not give the 
> expected result.
> {code:java}
>   for (int i = 0; i < cluster.getNumNameNodes(); i++) {
>     // First observer node is at idx 2, or 3 if 2 has been shut down
>     // It should get both getBlocks calls, all other NNs should see 0 calls
>     int expectedObserverIdx = withObserverFailure ? 3 : 2;
>     int expectedCount = (i == expectedObserverIdx) ? 2 : 0;
>     verify(namesystemSpies.get(i), times(expectedCount))
>         .getBlocks(any(), anyLong(), anyLong());
>   }
> {code}
> cc [~xkrogen],[~weichiu]. I am not very familiar with the Observer Read feature; 
> would you like to give some suggestions? 
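
One possible way to relax the check described in the issue is to sum the getBlocks counts over all observer NameNodes rather than pinning them to a single index. The sketch below shows only that counting logic in plain Java, without Mockito. It assumes the observers occupy the trailing indices, as the comment in the test suggests. `ObserverCallCheck` and its parameters are hypothetical names, and this is not necessarily what the attached patches do.

```java
// Hypothetical sketch; not taken from TestBalancerWithHANameNodes.
final class ObserverCallCheck {
  /**
   * Returns true when no NameNode before firstObserverIdx saw a getBlocks call
   * and the observers (indices >= firstObserverIdx) saw expectedTotal calls
   * between them, in any distribution.
   */
  static boolean callsLandedOnObservers(
      int[] callsPerNameNode, int firstObserverIdx, int expectedTotal) {
    int observerTotal = 0;
    for (int i = 0; i < callsPerNameNode.length; i++) {
      if (i >= firstObserverIdx) {
        // Any observer may serve the request, so only the sum matters.
        observerTotal += callsPerNameNode[i];
      } else if (callsPerNameNode[i] != 0) {
        // Active and Standby NameNodes should not see getBlocks at all.
        return false;
      }
    }
    return observerTotal == expectedTotal;
  }
}
```

In a real test the per-NameNode counts would have to come from the spies; how to collect them is left open here.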






[jira] [Updated] (HDFS-13183) Standby NameNode process getBlocks request to reduce Active load

2020-05-28 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13183:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Standby NameNode process getBlocks request to reduce Active load
> 
>
> Key: HDFS-13183
> URL: https://issues.apache.org/jira/browse/HDFS-13183
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover, namenode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-13183-trunk.001.patch, HDFS-13183-trunk.002.patch, 
> HDFS-13183-trunk.003.patch, HDFS-13183.004.patch, HDFS-13183.005.patch, 
> HDFS-13183.006.patch, HDFS-13183.007.patch, HDFS-13183.addendum.patch, 
> HDFS-13183.addendum.patch
>
>
> The performance of the Active NameNode can be impacted when {{Balancer}} requests 
> #getBlocks, since querying the blocks of overly full DNs is currently extremely 
> inefficient. The main reason is that {{NameNodeRpcServer#getBlocks}} 
> holds the read lock for a long time. In the extreme case, all handlers of the Active 
> NameNode RPC server are occupied by one reader, 
> {{NameNodeRpcServer#getBlocks}}, and other write operation calls, so the Active 
> NameNode enters a state of false death for seconds or even minutes.
> Similar performance concerns about the Balancer have been reported in HDFS-9412, 
> HDFS-7967, etc.
> If the Standby NameNode can shoulder the heavy #getBlocks burden, it could speed up 
> balancing and reduce the performance impact on the Active NameNode.






[jira] [Commented] (HDFS-13183) Standby NameNode process getBlocks request to reduce Active load

2020-05-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118409#comment-17118409
 ] 

Ayush Saxena commented on HDFS-13183:
-

Committed the addendum to trunk and branch-3.3.
Thanks [~hexiaoqiao] for the contribution and [~Jim_Brennan] for the review!

> Standby NameNode process getBlocks request to reduce Active load
> 
>
> Key: HDFS-13183
> URL: https://issues.apache.org/jira/browse/HDFS-13183
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover, namenode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-13183-trunk.001.patch, HDFS-13183-trunk.002.patch, 
> HDFS-13183-trunk.003.patch, HDFS-13183.004.patch, HDFS-13183.005.patch, 
> HDFS-13183.006.patch, HDFS-13183.007.patch, HDFS-13183.addendum.patch, 
> HDFS-13183.addendum.patch
>
>
> The performance of the Active NameNode can be impacted when {{Balancer}} requests 
> #getBlocks, since querying the blocks of overly full DNs is currently extremely 
> inefficient. The main reason is that {{NameNodeRpcServer#getBlocks}} 
> holds the read lock for a long time. In the extreme case, all handlers of the Active 
> NameNode RPC server are occupied by one reader, 
> {{NameNodeRpcServer#getBlocks}}, and other write operation calls, so the Active 
> NameNode enters a state of false death for seconds or even minutes.
> Similar performance concerns about the Balancer have been reported in HDFS-9412, 
> HDFS-7967, etc.
> If the Standby NameNode can shoulder the heavy #getBlocks burden, it could speed up 
> balancing and reduce the performance impact on the Active NameNode.






[jira] [Work started] (HDFS-14984) HDFS setQuota: Error message should be added for invalid input max range value to hdfs dfsadmin -setQuota command

2020-05-28 Thread Zhao Yi Ming (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-14984 started by Zhao Yi Ming.
---
> HDFS setQuota: Error message should be added for invalid input max range 
> value to hdfs dfsadmin -setQuota command
> -
>
> Key: HDFS-14984
> URL: https://issues.apache.org/jira/browse/HDFS-14984
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Souryakanta Dwivedy
>Assignee: Zhao Yi Ming
>Priority: Minor
> Attachments: image-2019-11-13-14-05-19-603.png, 
> image-2019-11-13-14-07-04-536.png
>
>
> An error message should be added for the invalid input max range value 
> "9223372036854775807" to the hdfs dfsadmin -setQuota command.
>  * Set the quota for a directory with the invalid input value 
> "9223372036854775807": the command succeeds without displaying any 
> result. The quota value is not set for the directory internally, but it 
> would be better from a usability point of view if an error message were 
> displayed for the invalid max range value "9223372036854775807", as is 
> done when the input value is "0". For example: "hdfs 
> dfsadmin -setQuota 9223372036854775807 /quota"
> !image-2019-11-13-14-05-19-603.png!
>  
>  * Try to set the quota for a directory with the invalid input value "0": it 
> throws the error message "setQuota: Invalid values for quota : 0 and 
> 9223372036854775807". For example: "hdfs dfsadmin -setQuota 0 /quota"
> !image-2019-11-13-14-07-04-536.png!






[jira] [Comment Edited] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-05-28 Thread Jiang Xin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118394#comment-17118394
 ] 

Jiang Xin edited comment on HDFS-15160 at 5/28/20, 7:28 AM:


[~sodonnell]  Thanks for your great work. 

I'm going to apply patch 005 and run it on our production cluster. But the code 
`synchronized(replica) ...` in the method getBlockLocalPathInfo confuses me. It is 
holding the write lock, so it doesn't need to worry about genStamps being updated. I 
assume that you wanted to change getBlockLocalPathInfo to use the read lock, am 
I right?

Thanks


was (Author: jiang xin):
[~sodonnell]  I'm going to apply patch 005 and run it on our production cluster. 
But the code `synchronized(replica) ...` in the method getBlockLocalPathInfo 
confuses me. It is holding the write lock, so it doesn't need to worry about 
genStamps being updated. I assume that you wanted to change getBlockLocalPathInfo to 
use the read lock, am I right?

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> image-2020-04-10-17-18-08-128.png, image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use the 
> read lock rather than the write lock to improve concurrency. The first step 
> is to make the changes to ReplicaMap, as many other methods make calls to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.
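
The general pattern the description relies on, lookups under a shared read lock and mutations under the exclusive write lock, can be sketched with the JDK's `ReentrantReadWriteLock`. This is a minimal illustration only, not the DataNode code: `ReplicaMapSketch` and its methods are hypothetical, and a `String` stands in for the replica object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a replica map; not the HDFS ReplicaMap class.
final class ReplicaMapSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<Long, String> replicas = new HashMap<>();

  // Mutation: takes the exclusive write lock, blocking readers and writers.
  void add(long blockId, String replica) {
    lock.writeLock().lock();
    try {
      replicas.put(blockId, replica);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Read-only lookup: takes the shared read lock, so many readers can
  // proceed concurrently as long as no writer holds the write lock.
  String get(long blockId) {
    lock.readLock().lock();
    try {
      return replicas.get(blockId);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Read-only size query, also under the shared read lock.
  int size() {
    lock.readLock().lock();
    try {
      return replicas.size();
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Callers that only inspect the map then contend with writers but not with each other, which is the concurrency gain the description is after.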






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-05-28 Thread Jiang Xin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118394#comment-17118394
 ] 

Jiang Xin commented on HDFS-15160:
--

[~sodonnell]  I'm going to apply patch 005 and run it on our production cluster. 
But the code `synchronized(replica) ...` in the method getBlockLocalPathInfo 
confuses me. It is holding the write lock, so it doesn't need to worry about 
genStamps being updated. I assume that you wanted to change getBlockLocalPathInfo to 
use the read lock, am I right?

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> image-2020-04-10-17-18-08-128.png, image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use the 
> read lock rather than the write lock to improve concurrency. The first step 
> is to make the changes to ReplicaMap, as many other methods make calls to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.


