[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973999#comment-16973999 ] Hadoop QA commented on HDFS-14924:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 51s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 19m 8s | trunk passed |
| +1 | compile | 1m 0s | trunk passed |
| +1 | checkstyle | 0m 51s | trunk passed |
| +1 | mvnsite | 1m 4s | trunk passed |
| +1 | shadedclient | 14m 38s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 15s | trunk passed |
| +1 | javadoc | 1m 15s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| -0 | checkstyle | 0m 47s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 622 unchanged - 0 fixed = 623 total (was 622) |
| +1 | mvnsite | 1m 0s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 31s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 20s | the patch passed |
| +1 | javadoc | 1m 13s | the patch passed |
|| Other Tests ||
| -1 | unit | 101m 45s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 59s | The patch does not generate ASF License warnings. |
| | | 164m 29s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
| | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap |
| | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14924 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985812/HDFS-14924.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d02ff0ab9add 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 73a386a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28310/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28310/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test
[jira] [Updated] (HDDS-2478) Sonar : remove temporary variable in XceiverClientSpi.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2478: Description: Sonar issues : https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 was: Sonar issue : https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > Sonar : remove temporary variable in XceiverClientSpi.sendCommand > - > > Key: HDDS-2478 > URL: https://issues.apache.org/jira/browse/HDDS-2478 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > > Sonar issues : > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2478) Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2478: Summary: Sonar : remove temporary variable in XceiverClientGrpc.sendCommand (was: Sonar : remove temporary variable in XceiverClientSpi.sendCommand) > Sonar : remove temporary variable in XceiverClientGrpc.sendCommand > -- > > Key: HDDS-2478 > URL: https://issues.apache.org/jira/browse/HDDS-2478 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > > Sonar issues : > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2480) Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect
Supratim Deka created HDDS-2480: --- Summary: Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect Key: HDDS-2480 URL: https://issues.apache.org/jira/browse/HDDS-2480 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsWE=AW5md_AGKcVY8lQ4ZsWE -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
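The ticket does not include the patch, but the usual shape of this Sonar finding is "either log or rethrow the exception, not both": a reconnect path that logs each failed attempt and also propagates the exception produces one log line per retry. As a hedged, simplified sketch (the class and method names below are stand-ins, not the actual XceiverClientGrpc code):

```java
// Hypothetical sketch of the "either log or rethrow" fix; ReconnectSketch is
// illustrative only, not the real XceiverClientGrpc.reconnect implementation.
import java.io.IOException;

public class ReconnectSketch {
    static int attempts = 0;

    static void connectOnce() throws IOException {
        attempts++;
        if (attempts < 3) {
            throw new IOException("connection refused");
        }
    }

    // Before: catch, LOG.error(e), then rethrow -- the same failure is logged
    // on every retry, producing log spam. After: let the exception propagate
    // and let the caller decide how (and how often) to log it.
    static void reconnect() throws IOException {
        connectOnce(); // no catch-log-rethrow here
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            try {
                reconnect();
            } catch (IOException e) {
                // single place that records the failure
            }
        }
        System.out.println(attempts);
    }
}
```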
[jira] [Created] (HDDS-2479) Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry
Supratim Deka created HDDS-2479: --- Summary: Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry Key: HDDS-2479 URL: https://issues.apache.org/jira/browse/HDDS-2479 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV_=AW5md_AGKcVY8lQ4ZsV_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
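The rule behind this ticket: instead of catching a broad exception type and branching on `e instanceof SomeException`, catch the specific type in its own catch block. A minimal illustration (names are hypothetical, not the actual sendCommandWithRetry code):

```java
// Sketch of replacing an instanceof check with a dedicated catch block.
import java.util.concurrent.ExecutionException;

public class CatchBlockSketch {
    static String handle(boolean fail) {
        try {
            if (fail) {
                throw new ExecutionException(new RuntimeException("boom"));
            }
            return "ok";
        } catch (ExecutionException e) {
            // Before: catch (Exception e) { if (e instanceof ExecutionException) ... }
            // After: the specific catch block replaces the instanceof test.
            return "execution-failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(false)); // ok
        System.out.println(handle(true));  // execution-failed
    }
}
```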
[jira] [Created] (HDDS-2478) Sonar : remove temporary variable in XceiverClientSpi.sendCommand
Supratim Deka created HDDS-2478: --- Summary: Sonar : remove temporary variable in XceiverClientSpi.sendCommand Key: HDDS-2478 URL: https://issues.apache.org/jira/browse/HDDS-2478 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue : https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
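The fix behind this class of Sonar finding ("local variables should not be declared and then immediately returned") is to return the expression directly. A hedged, self-contained illustration; the names below are stand-ins, not the actual XceiverClientSpi.sendCommand signature:

```java
// Illustrative sketch of removing an immediately-returned temporary variable.
public class TempVarSketch {
    // Before (flagged by Sonar):
    //   String response = buildResponse(id);
    //   return response;
    // After: no temporary, return the expression directly.
    static String sendCommand(int id) {
        return buildResponse(id);
    }

    static String buildResponse(int id) {
        return "response-" + id;
    }

    public static void main(String[] args) {
        System.out.println(sendCommand(7)); // prints response-7
    }
}
```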
[jira] [Work logged] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?focusedWorklogId=343136=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343136 ] ASF GitHub Bot logged work on HDDS-2473: Author: ASF GitHub Bot Created on: 14/Nov/19 05:24 Start Date: 14/Nov/19 05:24 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #162: HDDS-2473. Fix code reliability issues found by Sonar in Ozone Recon module. URL: https://github.com/apache/hadoop-ozone/pull/162 ## What changes were proposed in this pull request? Sonarcloud.io has flagged a number of code reliability issues in Ozone recon (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). Following issues have been fixed. - Double Brace Initialization should not be used - Resources should be closed - InterruptedException should not be ignored ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2473 ## How was this patch tested? Ran unit tests in 'recon' module. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343136) Remaining Estimate: 0h Time Spent: 10m > Fix code reliability issues found by Sonar in Ozone Recon module. 
> - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
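The three reliability rules fixed in this PR have well-known remedies; a compact sketch of each (simplified stand-ins, not the actual Recon code):

```java
// Sketches of the three Sonar fixes: no double-brace initialization,
// close resources with try-with-resources, never swallow InterruptedException.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class ReliabilityFixes {
    // Double-brace initialization creates an anonymous subclass per call site
    // and can leak the enclosing instance; build the map explicitly instead.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        return m;
    }

    // "Resources should be closed": try-with-resources closes the reader
    // on every exit path, including exceptions.
    static String readFirstLine(String text) throws IOException {
        try (BufferedReader r = new BufferedReader(new StringReader(text))) {
            return r.readLine();
        }
    }

    // "InterruptedException should not be ignored": restore the interrupt
    // flag so callers up the stack can observe the interruption.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildMap().size());          // 2
        System.out.println(readFirstLine("x\ny"));      // x
        sleepQuietly(1);
    }
}
```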
[jira] [Updated] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2473: - Labels: pull-request-available (was: ) > Fix code reliability issues found by Sonar in Ozone Recon module. > - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
[ https://issues.apache.org/jira/browse/HDDS-2472?focusedWorklogId=343135=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343135 ] ASF GitHub Bot logged work on HDDS-2472: Author: ASF GitHub Bot Created on: 14/Nov/19 05:22 Start Date: 14/Nov/19 05:22 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #161: HDDS-2472. Use try-with-resources while creating FlushOptions in RDBStore URL: https://github.com/apache/hadoop-ozone/pull/161 ## What changes were proposed in this pull request? Use try-with-resources while creating FlushOptions in RDBStore class. Remove code duplication in getCheckpoint method. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2472 ## How was this patch tested? Unit tested. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343135) Time Spent: 0.5h (was: 20m) > Use try-with-resources while creating FlushOptions in RDBStore. > --- > > Key: HDDS-2472 > URL: https://issues.apache.org/jira/browse/HDDS-2472 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Link to the sonar issue flag - > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
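The pattern in this PR: RocksDB option objects are native-backed and must be closed. The sketch below uses a stand-in `FlushOptions` class (the real one is `org.rocksdb.FlushOptions`, also an `AutoCloseable`) to show how try-with-resources guarantees `close()` runs on every path:

```java
// Hedged sketch: try-with-resources around an AutoCloseable options object.
// FlushOptions here is a local stand-in, not the org.rocksdb class.
public class FlushSketch {
    static final StringBuilder LOG = new StringBuilder();

    static class FlushOptions implements AutoCloseable {
        FlushOptions setWaitForFlush(boolean wait) { return this; }
        @Override
        public void close() { LOG.append("closed"); }
    }

    static void flush() {
        // Closed automatically, even if the body throws -- before the fix,
        // an exception between construction and a manual close() leaked it.
        try (FlushOptions opts = new FlushOptions()) {
            opts.setWaitForFlush(true);
            // db.flush(opts) would go here in the real RDBStore code
        }
    }

    public static void main(String[] args) {
        flush();
        System.out.println(LOG); // the options object was closed
    }
}
```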
[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973933#comment-16973933 ] hemanthboyina commented on HDFS-14924: -- There was a similar issue with createSnapshot in HDFS-14922. I have updated the patch; please review. > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch, HDFS-14924.002.patch > > > RenameSnapshot doesnt updating modification time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
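The likely shape of the fix, by analogy with the createSnapshot change referenced above, is to stamp the operation time when the snapshot is renamed. A hedged, heavily simplified stand-in (not the actual INodeDirectory/SnapshotManager code):

```java
// Illustrative sketch: renameSnapshot should update the modification time,
// mirroring what createSnapshot already does. All names are hypothetical.
public class SnapshotDirSketch {
    private String snapshotName;
    private long modificationTime;

    SnapshotDirSketch(String name, long mtime) {
        this.snapshotName = name;
        this.modificationTime = mtime;
    }

    // Before the fix, only the name changed; stamping `now` is the missing step.
    void renameSnapshot(String newName, long now) {
        this.snapshotName = newName;
        this.modificationTime = now; // record the rename time
    }

    String getSnapshotName() { return snapshotName; }
    long getModificationTime() { return modificationTime; }

    public static void main(String[] args) {
        SnapshotDirSketch dir = new SnapshotDirSketch("s1", 1000L);
        dir.renameSnapshot("s2", 2000L);
        System.out.println(dir.getSnapshotName() + " " + dir.getModificationTime());
    }
}
```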
[jira] [Updated] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14924: - Attachment: HDFS-14924.002.patch > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch, HDFS-14924.002.patch > > > RenameSnapshot doesnt updating modification time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14987) EC: EC file blockId location info displaying as "null" with hdfs fsck -blockId command
[ https://issues.apache.org/jira/browse/HDFS-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravuri Sushma sree reassigned HDFS-14987: - Assignee: Ravuri Sushma sree > EC: EC file blockId location info displaying as "null" with hdfs fsck > -blockId command > -- > > Key: HDFS-14987 > URL: https://issues.apache.org/jira/browse/HDFS-14987 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec, tools >Affects Versions: 3.1.2 >Reporter: Souryakanta Dwivedy >Assignee: Ravuri Sushma sree >Priority: Major > Attachments: EC_file_block_info.PNG, > image-2019-11-13-18-34-00-067.png, image-2019-11-13-18-36-29-063.png, > image-2019-11-13-18-38-18-899.png > > > EC file blockId location info displaying as "null" with hdfs fsck -blockId > command > * Check the blockId information of an EC enabled file with "hdfs fsck > -blockId": the blockId location related info will display as null, which > needs to be rectified. > Check the attachment "EC_file_block_info" > === > !image-2019-11-13-18-34-00-067.png! > > * Check the output of a normal file block to compare > !image-2019-11-13-18-36-29-063.png! > === > !image-2019-11-13-18-38-18-899.png! > * Actual Output :- null > * Expected output :- It should display the blockId location related info as > (nodes, racks) of the block as specified in the usage info of fsck -blockId > option. [like : Block replica on > datanode/rack: BLR1xx038/default-rack is HEALTHY] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973916#comment-16973916 ] Aiphago commented on HDFS-14986: I mean I can fix the deadlock problem by adding the dataset lock; my first comment describes the solution. > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: 
hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973910#comment-16973910 ] Lisheng Sun commented on HDFS-14986: [~Aiphag0] If we add the dataset lock in FsDatasetImpl#deepCopyReplica on trunk, a deadlock will happen. I didn't use FsDatasetImpl#datasetLock, because FsDatasetImpl#addBlockPool, while holding datasetLock, calls FsDatasetImpl#deepCopyReplica in another thread when the DataNode starts. https://issues.apache.org/jira/browse/HDFS-14313?focusedCommentId=16887859=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16887859 > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973906#comment-16973906 ] Aiphago commented on HDFS-14986: We use a branch before 2.8 with synchronized, and we fixed the deadlock problem there. I think it's better to add the dataset lock in FsDatasetImpl#deepCopyReplica on trunk; the method to solve the deadlock problem is the same. > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira 
(v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
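The ConcurrentModificationException above happens because the refresh thread copies the replica set while writer threads mutate it. The idea debated in this thread, copying under the same lock the writers hold, can be sketched as follows (a simplified stand-in, not the actual FsDatasetImpl code; the deadlock caveat about addBlockPool discussed above still applies to the real class):

```java
// Hedged sketch: take the snapshot copy under the writers' lock so the
// backing set cannot change mid-iteration. Names are illustrative.
import java.util.HashSet;
import java.util.Set;

public class ReplicaMapSketch {
    private final Object datasetLock = new Object(); // stand-in for the DN dataset lock
    private final Set<String> replicas = new HashSet<>();

    void addReplica(String blockId) {
        synchronized (datasetLock) {
            replicas.add(blockId);
        }
    }

    // Copying inside the lock prevents the mid-iteration mutation that
    // raises ConcurrentModificationException in the refresh thread.
    Set<String> deepCopyReplica() {
        synchronized (datasetLock) {
            return new HashSet<>(replicas);
        }
    }

    public static void main(String[] args) {
        ReplicaMapSketch map = new ReplicaMapSketch();
        map.addReplica("blk_1");
        map.addReplica("blk_2");
        Set<String> copy = map.deepCopyReplica();
        map.addReplica("blk_3");        // later writes do not affect the copy
        System.out.println(copy.size()); // 2
    }
}
```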
[jira] [Commented] (HDDS-1847) Datanode Kerberos principal and keytab config key looks inconsistent
[ https://issues.apache.org/jira/browse/HDDS-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973897#comment-16973897 ] Chris Teoh commented on HDDS-1847: -- Thanks very much for your help [~aengineer]. Really appreciate it! :)(y) > Datanode Kerberos principal and keytab config key looks inconsistent > > > Key: HDDS-1847 > URL: https://issues.apache.org/jira/browse/HDDS-1847 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Eric Yang >Assignee: Chris Teoh >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Ozone Kerberos configuration can be very confusing: > | config name | Description | > | hdds.scm.kerberos.principal | SCM service principal | > | hdds.scm.kerberos.keytab.file | SCM service keytab file | > | ozone.om.kerberos.principal | Ozone Manager service principal | > | ozone.om.kerberos.keytab.file | Ozone Manager keytab file | > | hdds.scm.http.kerberos.principal | SCM service spnego principal | > | hdds.scm.http.kerberos.keytab.file | SCM service spnego keytab file | > | ozone.om.http.kerberos.principal | Ozone Manager spnego principal | > | ozone.om.http.kerberos.keytab.file | Ozone Manager spnego keytab file | > | hdds.datanode.http.kerberos.keytab | Datanode spnego keytab file | > | hdds.datanode.http.kerberos.principal | Datanode spnego principal | > | dfs.datanode.kerberos.principal | Datanode service principal | > | dfs.datanode.keytab.file | Datanode service keytab file | > The prefix are very different for each of the datanode configuration. It > would be nice to have some consistency for datanode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973892#comment-16973892 ] Konstantin Shvachko commented on HDFS-14973: (1) Good point [~xkrogen] . I think Balancer, Mover, and SPS should all be limited to the same rate of getBlocks to NameNode. This is probably another good reason to track it inside {{NameNodeConnector}}. Would be good to reflect this in the description in {{hdfs-site.xml}} (2) Sure let's leave refactoring of TestBalancer for the next time. Nits: # {{dispatchBlockMoves()}} does not need {{dSec}} anymore. # Seems would be good for {{dispatchBlockMoves()}} to still log {{concurrentThreads}} in debug mode. > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. 
As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. > This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNameystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
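The rate limiting suggested above (capping getBlocks calls at a shared QPS inside NameNodeConnector rather than delaying a prefix of dispatcher threads) can be illustrated with a fixed-window counter. This is a hedged sketch of the idea only, not the actual Balancer patch; the 20 QPS figure is the hardcoded limit mentioned in the comment:

```java
// Hypothetical centralized limiter: every getBlocks caller asks how long to
// wait; once the per-second budget is spent, callers are pushed to the next
// window. Uses an injected clock value so behavior is deterministic.
public class GetBlocksRateLimiter {
    private final int maxQps;
    private long windowStartMs;
    private int usedInWindow;

    GetBlocksRateLimiter(int maxQps, long startMs) {
        this.maxQps = maxQps;
        this.windowStartMs = startMs;
    }

    // Returns the delay (ms) the caller should sleep before issuing getBlocks.
    synchronized long acquireDelayMs(long nowMs) {
        if (nowMs - windowStartMs >= 1000) {
            windowStartMs = nowMs;   // start a fresh one-second window
            usedInWindow = 0;
        }
        if (usedInWindow < maxQps) {
            usedInWindow++;
            return 0;                // budget available, go immediately
        }
        return windowStartMs + 1000 - nowMs; // wait for the next window
    }

    public static void main(String[] args) {
        GetBlocksRateLimiter limiter = new GetBlocksRateLimiter(20, 0L);
        int immediate = 0;
        for (int i = 0; i < 25; i++) {
            if (limiter.acquireDelayMs(0L) == 0) {
                immediate++;
            }
        }
        System.out.println(immediate); // 20 of the 25 calls proceed at once
    }
}
```

Because every dispatcher thread consults the same limiter, the cap holds regardless of the threadpool size, which avoids the prefix-delay flaw described above.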
[jira] [Resolved] (HDDS-2383) Closing open container via SCMCli throws exception
[ https://issues.apache.org/jira/browse/HDDS-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar resolved HDDS-2383. --- Resolution: Duplicate > Closing open container via SCMCli throws exception > -- > > Key: HDDS-2383 > URL: https://issues.apache.org/jira/browse/HDDS-2383 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Rajesh Balamohan >Assignee: Nanda kumar >Priority: Major > > This was observed in apache master branch. > Closing the container via {{SCMCli}} throws the following exception, though > the container ends up getting closed eventually. > {noformat} > 2019-10-30 02:44:41,794 INFO > org.apache.hadoop.hdds.scm.block.SCMBlockDeletingService: Block deletion > txnID mismatch in datanode 79626ba3-1957-46e5-a8b0-32d7f47fb801 for > containerID 6. Datanode delete txnID: 0, SCM txnID: 1004 > 2019-10-30 02:44:41,810 INFO > org.apache.hadoop.hdds.scm.container.IncrementalContainerReportHandler: > Moving container #4 to CLOSED state, datanode > 8885d4ba-228a-4fd2-bf5a-831f01594c6c{ip: 10.17.234.37, host: > vd1327.halxg.cloudera.com, networkLocation: /default-rack, certSerialId: > null} reported CLOSED replica. > 2019-10-30 02:44:41,826 INFO > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer: Object type > container id 4 op close new stage complete > 2019-10-30 02:44:41,826 ERROR > org.apache.hadoop.hdds.scm.container.ContainerStateManager: Failed to update > container state #4, reason: invalid state transition from state: CLOSED upon > event: CLOSE. > 2019-10-30 02:44:41,826 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 6 on 9860, call Call#3 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.submitRequest > from 10.17.234.32:45926 > org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update > container state #4, reason: invalid state transition from state: CLOSED upon > event: CLOSE. 
> at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:338) > at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:326) > at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.notifyObjectStageChange(SCMClientProtocolServer.java:388) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.notifyObjectStageChange(StorageContainerLocationProtocolServerSideTranslatorPB.java:303) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.processRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:158) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB$$Lambda$152/2036820231.apply(Unknown > Source) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.submitRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:112) > at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:30454) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {noformat} -- This message was sent by Atlassian Jira 
(v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
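The "invalid state transition from state: CLOSED upon event: CLOSE" error above (the same symptom tracked in HDDS-1940 below) is a classic state-machine idempotency gap: the container report handler already moved the container to CLOSED before the client-driven CLOSE notification arrived, and the second CLOSE then throws. A minimal sketch of the fix idea — treating CLOSE on an already-CLOSED container as a harmless no-op — is below; the enum names and `ContainerStateMachine` class are a hypothetical simplification, not the actual `ContainerStateManager` code.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: a container state machine that tolerates a
// duplicate CLOSE event instead of throwing on CLOSED -> CLOSE.
enum State { OPEN, CLOSING, CLOSED }
enum Event { FINALIZE, CLOSE }

class ContainerStateMachine {
    private static final Map<State, Map<Event, State>> TRANSITIONS =
        new EnumMap<>(State.class);
    static {
        TRANSITIONS.put(State.OPEN,
            Map.of(Event.FINALIZE, State.CLOSING, Event.CLOSE, State.CLOSED));
        TRANSITIONS.put(State.CLOSING, Map.of(Event.CLOSE, State.CLOSED));
        TRANSITIONS.put(State.CLOSED, Map.of()); // no outgoing transitions
    }

    State apply(State current, Event event) {
        State next = TRANSITIONS.get(current).get(event);
        if (next != null) {
            return next;
        }
        // Idempotency: the report handler may have raced the CLI and
        // already closed the container; succeed silently rather than
        // surfacing an SCMException for a duplicate CLOSE.
        if (current == State.CLOSED && event == Event.CLOSE) {
            return State.CLOSED;
        }
        throw new IllegalStateException(
            "invalid state transition from state: " + current
            + " upon event: " + event);
    }
}
```

With this shape, the scmcli-driven CLOSE that loses the race still returns success, matching the observation in both tickets that the container "ends up getting closed eventually" anyway.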
[jira] [Created] (HDFS-14988) HDFS should avoid read/write data from slow disks.
yimeng created HDFS-14988: - Summary: HDFS should avoid read/write data from slow disks. Key: HDFS-14988 URL: https://issues.apache.org/jira/browse/HDFS-14988 Project: Hadoop HDFS Issue Type: Improvement Components: block placement, datanode Affects Versions: 3.2.1, 3.1.1 Reporter: yimeng A slow disk causes real-time services (such as HBase) to slow down. Slow disk detection was added in HDFS-11461, but detected slow disks are only recorded as metrics. I hope to go further and act on the detected slow disks. In my view, slow disks can be factored into the read policy: if a block replica sits on a slow disk of a DataNode, the replica on another DataNode is selected instead. For writes, slow disks can be factored into the write policy: we can remove the slow disks from the candidate set and then select a disk to write to based on dfs.datanode.fsdataset.volume.choosing.policy.
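The write-path half of the proposal can be sketched as a thin filter in front of whatever volume choosing policy is configured. This is a hedged illustration only: `Volume`, `SlowDiskAwareChooser`, and the `BiFunction`-shaped delegate are simplified stand-ins, not Hadoop's real `dfs.datanode.fsdataset.volume.choosing.policy` plugin interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;

// Simplified stand-in for a DataNode storage volume.
class Volume {
    final String path;
    Volume(String path) { this.path = path; }
}

// Hypothetical wrapper: drop slow disks before delegating to the
// configured policy (round-robin, available-space, etc.).
class SlowDiskAwareChooser {
    private final Set<String> slowDiskPaths; // fed by slow-disk detection (cf. HDFS-11461 metrics)
    private final BiFunction<List<Volume>, Long, Volume> delegate;

    SlowDiskAwareChooser(Set<String> slowDiskPaths,
                         BiFunction<List<Volume>, Long, Volume> delegate) {
        this.slowDiskPaths = slowDiskPaths;
        this.delegate = delegate;
    }

    Volume chooseVolume(List<Volume> volumes, long blockSize) {
        List<Volume> healthy = new ArrayList<>();
        for (Volume v : volumes) {
            if (!slowDiskPaths.contains(v.path)) {
                healthy.add(v);
            }
        }
        // Fall back to the full list rather than failing the write
        // when every disk is currently flagged as slow.
        return delegate.apply(healthy.isEmpty() ? volumes : healthy, blockSize);
    }
}
```

The fallback branch matters: slow-disk flags are transient, so a datanode where all volumes are momentarily flagged should degrade to normal placement instead of rejecting writes.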
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973862#comment-16973862 ] Íñigo Goiri commented on HDFS-14983: [~aajisaka], true, that's missing. > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode.
[jira] [Updated] (HDDS-1940) Closing open container via scmcli gives false error message
[ https://issues.apache.org/jira/browse/HDDS-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-1940: --- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) [~adoroszlai] Thanks for review and testing the patch. [~nanda] Thanks for the contribution. I have committed this patch to the master branch. > Closing open container via scmcli gives false error message > --- > > Key: HDDS-1940 > URL: https://issues.apache.org/jira/browse/HDDS-1940 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Nanda kumar >Priority: Minor > Labels: incompatibleChange, pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{scmcli close}} prints an error message about invalid state transition after > it had successfully closed the container. > {code:title=CLI} > $ ozone scmcli info 2 > ... > Container State: OPEN > ... > $ ozone scmcli close 2 > ... > client-09830A377AA9->f27bf787-8711-41d4-b0fd-3ef50b5c076f: receive > RaftClientReply:client-09830A377AA9->f27bf787-8711-41d4-b0fd-3ef50b5c076f@group-7831D6F2EF1B, > cid=0, SUCCESS, logIndex=11, > commits[f27bf787-8711-41d4-b0fd-3ef50b5c076f:c12, > 37ba33fe-c9ed-4ac2-a6e5-57ce658168b4:c11, > feb68ba4-0a8a-4eda-9915-7dc090e5f46c:c11] > Failed to update container state #2, reason: invalid state transition from > state: CLOSED upon event: CLOSE. > $ ozone scmcli info 2 > ... > Container State: CLOSED > ... > {code} > {code:title=logs} > scm_1 | 2019-08-09 15:15:01 [IPC Server handler 1 on 9860] INFO > SCMClientProtocolServer:366 - Object type container id 1 op close new stage > begin > dn3_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > dn1_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. 
> scm_1 | 2019-08-09 15:15:02 > [EventQueue-IncrementalContainerReportForIncrementalContainerReportHandler] > INFO IncrementalContainerReportHandler:176 - Moving container #1 to CLOSED > state, datanode feb68ba4-0a8a-4eda-9915-7dc090e5f46c{ip: 10.5.1.6, host: > ozone-static_dn3_1.ozone-static_net, networkLocation: /default-rack, > certSerialId: null} reported CLOSED replica. > dn2_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] INFO > SCMClientProtocolServer:366 - Object type container id 1 op close new stage > complete > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] ERROR > ContainerStateManager:335 - Failed to update container state #1, reason: > invalid state transition from state: CLOSED upon event: CLOSE. > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] INFO Server:2726 > - IPC Server handler 3 on 9860, call Call#3 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.notifyObjectStageChange > from 10.5.0.71:57746 > scm_1 | org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update > container state #1, reason: invalid state transition from state: CLOSED upon > event: CLOSE. 
> scm_1 | at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:336) > scm_1 | at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:312) > scm_1 | at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.notifyObjectStageChange(SCMClientProtocolServer.java:379) > scm_1 | at > org.apache.hadoop.ozone.protocolPB.StorageContainerLocationProtocolServerSideTranslatorPB.notifyObjectStageChange(StorageContainerLocationProtocolServerSideTranslatorPB.java:219) > scm_1 | at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:16398) > scm_1 | at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > scm_1 | at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > scm_1 | at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > scm_1 | at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > scm_1 | at java.base/java.security.AccessController.doPrivileged(Native > Method) > scm_1 | at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > scm_1 | at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > scm_1 | at
[jira] [Work logged] (HDDS-1940) Closing open container via scmcli gives false error message
[ https://issues.apache.org/jira/browse/HDDS-1940?focusedWorklogId=343059=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343059 ] ASF GitHub Bot logged work on HDDS-1940: Author: ASF GitHub Bot Created on: 14/Nov/19 01:26 Start Date: 14/Nov/19 01:26 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #153: HDDS-1940. Closing open container via scmcli gives false error message. URL: https://github.com/apache/hadoop-ozone/pull/153 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343059) Time Spent: 20m (was: 10m) > Closing open container via scmcli gives false error message > --- > > Key: HDDS-1940 > URL: https://issues.apache.org/jira/browse/HDDS-1940 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Nanda kumar >Priority: Minor > Labels: incompatibleChange, pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {{scmcli close}} prints an error message about invalid state transition after > it had successfully closed the container. > {code:title=CLI} > $ ozone scmcli info 2 > ... > Container State: OPEN > ... > $ ozone scmcli close 2 > ... > client-09830A377AA9->f27bf787-8711-41d4-b0fd-3ef50b5c076f: receive > RaftClientReply:client-09830A377AA9->f27bf787-8711-41d4-b0fd-3ef50b5c076f@group-7831D6F2EF1B, > cid=0, SUCCESS, logIndex=11, > commits[f27bf787-8711-41d4-b0fd-3ef50b5c076f:c12, > 37ba33fe-c9ed-4ac2-a6e5-57ce658168b4:c11, > feb68ba4-0a8a-4eda-9915-7dc090e5f46c:c11] > Failed to update container state #2, reason: invalid state transition from > state: CLOSED upon event: CLOSE. > $ ozone scmcli info 2 > ... > Container State: CLOSED > ... 
> {code} > {code:title=logs} > scm_1 | 2019-08-09 15:15:01 [IPC Server handler 1 on 9860] INFO > SCMClientProtocolServer:366 - Object type container id 1 op close new stage > begin > dn3_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > dn1_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > scm_1 | 2019-08-09 15:15:02 > [EventQueue-IncrementalContainerReportForIncrementalContainerReportHandler] > INFO IncrementalContainerReportHandler:176 - Moving container #1 to CLOSED > state, datanode feb68ba4-0a8a-4eda-9915-7dc090e5f46c{ip: 10.5.1.6, host: > ozone-static_dn3_1.ozone-static_net, networkLocation: /default-rack, > certSerialId: null} reported CLOSED replica. > dn2_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] INFO > SCMClientProtocolServer:366 - Object type container id 1 op close new stage > complete > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] ERROR > ContainerStateManager:335 - Failed to update container state #1, reason: > invalid state transition from state: CLOSED upon event: CLOSE. > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] INFO Server:2726 > - IPC Server handler 3 on 9860, call Call#3 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.notifyObjectStageChange > from 10.5.0.71:57746 > scm_1 | org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update > container state #1, reason: invalid state transition from state: CLOSED upon > event: CLOSE. 
> scm_1 | at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:336) > scm_1 | at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:312) > scm_1 | at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.notifyObjectStageChange(SCMClientProtocolServer.java:379) > scm_1 | at > org.apache.hadoop.ozone.protocolPB.StorageContainerLocationProtocolServerSideTranslatorPB.notifyObjectStageChange(StorageContainerLocationProtocolServerSideTranslatorPB.java:219) > scm_1 | at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:16398) > scm_1 | at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973855#comment-16973855 ] Akira Ajisaka commented on HDFS-14983: -- HDFS-14545 is to refresh the configs of the NameNodes via DFSRouter. I'd like to refresh the configs of the DFSRouter itself. > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode.
[jira] [Work logged] (HDDS-2308) Switch to centos with the apache/ozone-build docker image
[ https://issues.apache.org/jira/browse/HDDS-2308?focusedWorklogId=343040=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343040 ] ASF GitHub Bot logged work on HDDS-2308: Author: ASF GitHub Bot Created on: 14/Nov/19 00:57 Start Date: 14/Nov/19 00:57 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #9: HDDS-2308. Switch to centos with the apache/ozone-build docker image URL: https://github.com/apache/hadoop-docker-ozone/pull/9 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343040) Time Spent: 20m (was: 10m) > Switch to centos with the apache/ozone-build docker image > - > > Key: HDDS-2308 > URL: https://issues.apache.org/jira/browse/HDDS-2308 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: hs_err_pid16346.log > > Time Spent: 20m > Remaining Estimate: 0h > > I realized multiple JVM crashes in the daily builds: > > {code:java} > ERROR] ExecutionException The forked VM terminated without properly saying > goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractRename > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter5429192218879128313.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7227403571189445391tmp > surefire_1011197392458143645283tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractDistCp > > > [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter1355604543311368443.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire3938612864214747736tmp > surefire_933162535733309260236tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 {code} > > Based on the crash log (uploaded) it's related to the
[jira] [Resolved] (HDDS-2308) Switch to centos with the apache/ozone-build docker image
[ https://issues.apache.org/jira/browse/HDDS-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2308. Fix Version/s: 0.5.0 Resolution: Fixed Committed to the build branch. > Switch to centos with the apache/ozone-build docker image > - > > Key: HDDS-2308 > URL: https://issues.apache.org/jira/browse/HDDS-2308 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: hs_err_pid16346.log > > Time Spent: 20m > Remaining Estimate: 0h > > I realized multiple JVM crashes in the daily builds: > > {code:java} > ERROR] ExecutionException The forked VM terminated without properly saying > goodbye. VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractRename > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter5429192218879128313.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7227403571189445391tmp > surefire_1011197392458143645283tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractDistCp > > > [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter1355604543311368443.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire3938612864214747736tmp > surefire_933162535733309260236tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 {code} > > Based on the crash log (uploaded) it's related to the rocksdb JNI interface. > In the current ozone-build docker image (which provides the environment for > build) we use alpine where musl libc is used instead of the main glibc. I > think it would be more safe to use the same glibc what is used in production. > I tested with centos based docker image and it seems to be more stable. > Didn't see any more JVM crashes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973838#comment-16973838 ] Hadoop QA commented on HDFS-14973: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 56s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 50s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 825 unchanged - 1 fixed = 826 total (was 826) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 55s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 15s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.tools.TestDFSZKFailoverController | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14973 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985792/HDFS-14973.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux c687d780da15 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 73a386a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28309/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit |
[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973836#comment-16973836 ] Bharat Viswanadham commented on HDDS-2356: -- Hi [~timmylicheng] Thanks for sharing the logs. I see completeMultipartUpload is called with 286 parts, and OM is throwing an InvalidPart error, but from the audit log I was not able to tell which part is missing in OM. (And I see 286 successful commit-multipart-upload calls for the key.) I think there is a chance we are hitting the scenario in HDDS-2477 here. (Not completely sure; this is my analysis after looking at the logs.) I have opened a couple of Jiras, HDDS-2477, HDDS-2471 and HDDS-2470, which will help in analyzing/debugging this issue. (Let's see whether HDDS-2477 fixes it or not.) > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 > Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM > on a separate VM >Reporter: Li Cheng >Assignee: Bharat Viswanadham >Priority: Blocker > Fix For: 0.5.0 > > Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, > image-2019-10-31-18-56-56-177.png, om-audit-VM_50_210_centos.log, > om_audit_log_plc_1570863541668_9278.txt > > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say > it's VM0. > I use goofys as a fuse and enable the ozone S3 gateway to mount ozone to a path > on VM0, reading data from VM0's local disk and writing to the mount path. The > dataset has files of various sizes, from 0 bytes to GB-level, and it has > around 50,000 files. > The writing is slow (1GB in ~10 mins) and it stops after around 4GB. As I > look at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors > related to multipart upload. 
This error eventually causes the writing to > terminate and OM to be closed. > > Updated on 11/06/2019: > See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs > are in the attachment. > 2019-11-05 18:12:37,766 ERROR > org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: > MultipartUpload Commit is failed for Key:./2 > 0191012/plc_1570863541668_9278 in Volume/Bucket > s325d55ad283aa400af464c76d713c07ad/ozone-test > NO_SUCH_MULTIPART_UPLOAD_ERROR > org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload > is with specified uploadId fcda8608-b431-48b7-8386- > 0a332f1a709a-103084683261641950 > at > org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1 > 56) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB. > java:217) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100) > at > org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > > Updated on 10/28/2019: > See MISMATCH_MULTIPART_LIST error. > > 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete > Multipart Upload Request for bucket: ozone-test, key: > 20191012/plc_1570863541668_9278 > MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: > Complete Multipart Upload Failed: volume: > s3c89e813c80ffcea9543004d57b2a1239 bucket: > ozone-test key: 20191012/plc_1570863541668_9278 > at > org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732) > at >
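The diagnosability gap Bharat mentions above (OM reports InvalidPart, but the logs do not say which part is missing) can be sketched as a validation step that names the offending part number in the error. This is a minimal illustration, with hypothetical names, not the actual Ozone Manager code:

```java
import java.util.List;
import java.util.Map;

// Sketch: complete-multipart validation that names the offending part in the
// error message instead of a bare InvalidPart. Illustrative names only.
public class PartCheckDemo {
  // committedParts: part number -> part name recorded at commit time.
  static String findInvalidPart(Map<Integer, String> committedParts,
                                List<Integer> requestedParts) {
    for (Integer partNumber : requestedParts) {
      if (!committedParts.containsKey(partNumber)) {
        // Naming the part here would let an operator answer "which part is
        // missing" directly from the audit log.
        return "INVALID_PART: part " + partNumber + " was never committed";
      }
    }
    return null; // every requested part was committed
  }

  public static void main(String[] args) {
    Map<Integer, String> committed = Map.of(1, "p1", 2, "p2");
    System.out.println(findInvalidPart(committed, List.of(1, 2, 3)));
  }
}
```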
[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973836#comment-16973836 ] Bharat Viswanadham edited comment on HDDS-2356 at 11/14/19 12:41 AM: - Hi [~timmylicheng] Thanks for sharing the logs. I see completeMultipartUpload is called with 286 parts, and OM is throwing an error InvalidPart, but from an audit log, I was not able to know which part is missing in OM (because we don't print any such info in log/exception message). (And I see 286 success commit Multipart upload for the key). I think there might be a chance of the scenario HDDS-2477 we are hitting here. (Not completely sure, this is my analysis after looking up logs) I have opened couple of Jira's HDDS-2477 HDDS-2471 and HDDS-2470 which will help in analyzing/debugging this issue. (Let's see HDDS-2477 will fix it or not)
[jira] [Updated] (HDDS-2477) TableCache cleanup issue for OM non-HA
[ https://issues.apache.org/jira/browse/HDDS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2477: - Component/s: Ozone Manager > TableCache cleanup issue for OM non-HA > -- > > Key: HDDS-2477 > URL: https://issues.apache.org/jira/browse/HDDS-2477 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager > Reporter: Bharat Viswanadham > Assignee: Bharat Viswanadham > Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In the OM non-HA case, the ratisTransactionLogIndex is generated by > OmProtocolServersideTranslatorPB.java, and validateAndUpdateCache is called from > multiple handler threads. So consider a case where one thread with index 10 has > added to the doubleBuffer (indices 0-9 have not been added yet). The DoubleBuffer > flush thread flushes and calls cleanup, which evicts all cache entries with epoch > index up to 10. It should not evict entries that were put into the cache later > (under smaller indices) and are still in the process of being flushed to DB. This > causes inconsistency for a few OM requests. > > > Example: > 4 threads committing 4 parts. > 1st thread - part 1 - ratis index - 3 > 2nd thread - part 2 - ratis index - 2 > 3rd thread - part 3 - ratis index - 1 > > The first thread got the lock and put OmMultipartInfo (with part 1) into the > doubleBuffer and cache, and cleanup is called to evict all cache entries with index > up to 3. In the meantime the 2nd and 3rd threads put parts 2 and 3 into > OmMultipartInfo in the cache and doubleBuffer, but the first thread's cleanup may > evict those entries, since it is called with index 3. > > Now when the 4th part upload comes: at commit Multipart upload time, when it gets > the multipartinfo it sees only part 1 in OmMultipartInfo, as the OmMultipartInfo > with parts 1, 2, 3 is still in the process of being committed to DB. So after the > 4th part upload completes, in DB and cache we will have parts 1 and 4 only; the > part 2 and 3 information is lost. > > So for the non-HA case, cleanup will instead be called with the list of epochs > that need to be cleaned up. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
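The cleanup race described above can be illustrated with a minimal sketch. The class and method names here are illustrative, not the actual Ozone TableCache API: `cleanupUpTo` models the range-based eviction that causes the bug, and `cleanupEpochs` models the proposed fix of passing the exact list of flushed epochs.

```java
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Minimal sketch of the TableCache cleanup race. Illustrative names only.
public class EpochCacheDemo {
  static final class EpochCache {
    // epoch (ratis index) -> cached value, e.g. a committed part name
    private final NavigableMap<Long, String> entries = new ConcurrentSkipListMap<>();

    void put(long epoch, String value) { entries.put(epoch, value); }

    // Buggy cleanup: evicts every entry with epoch <= flushedEpoch, including
    // entries added by threads holding smaller indices whose requests are
    // still being flushed to DB.
    void cleanupUpTo(long flushedEpoch) { entries.headMap(flushedEpoch, true).clear(); }

    // Proposed fix: cleanup receives the exact list of flushed epochs and
    // removes only those entries.
    void cleanupEpochs(List<Long> flushedEpochs) { flushedEpochs.forEach(entries::remove); }

    int size() { return entries.size(); }
  }

  public static void main(String[] args) {
    // The thread at ratis index 3 (part 1) flushes first; parts at indices 1
    // and 2 are cached but not yet flushed.
    EpochCache buggy = new EpochCache();
    buggy.put(1, "part3"); buggy.put(2, "part2"); buggy.put(3, "part1");
    buggy.cleanupUpTo(3); // wipes the unflushed entries for parts 2 and 3 too

    EpochCache fixed = new EpochCache();
    fixed.put(1, "part3"); fixed.put(2, "part2"); fixed.put(3, "part1");
    fixed.cleanupEpochs(List.of(3L)); // only the flushed epoch is evicted

    System.out.println("range cleanup left: " + buggy.size()
        + ", epoch-list cleanup left: " + fixed.size());
  }
}
```

With the range-based cleanup all three entries vanish; with the epoch-list cleanup the two unflushed entries survive, which is the behavior the JIRA description calls for.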
[jira] [Updated] (HDDS-2477) TableCache cleanup issue for OM non-HA
[ https://issues.apache.org/jira/browse/HDDS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2477: - Status: Patch Available (was: Open)
[jira] [Work logged] (HDDS-2477) TableCache cleanup issue for OM non-HA
[ https://issues.apache.org/jira/browse/HDDS-2477?focusedWorklogId=343028=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343028 ] ASF GitHub Bot logged work on HDDS-2477: Author: ASF GitHub Bot Created on: 14/Nov/19 00:24 Start Date: 14/Nov/19 00:24 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #159: HDDS-2477. TableCache cleanup issue for OM non-HA.: URL: https://github.com/apache/hadoop-ozone/pull/159 ## What changes were proposed in this pull request? In OM in non-HA case, the ratisTransactionLogIndex is generated by OmProtocolServersideTranslatorPB.java. And in OM non-HA validateAndUpdateCache is called from multipleHandler threads. So think of a case where one thread which has an index - 10 has added to doubleBuffer. (0-9 still have not added). DoubleBuffer flush thread flushes and call cleanup. (So, now cleanup will go and cleanup all cache entries with less than 10 epoch) This should not have cleanup those which might have put in to cache later and which are in process of flush to DB. This will cause inconsitency for few OM requests. Example: 4 threads Committing 4 parts. 1st thread - part 1 - ratis Index - 3 2nd thread - part 2 - ratis index - 2 3rd thread - part3 - ratis index - 1 First thread got lock, and put in to doubleBuffer and cache with OmMultipartInfo (with part1). And cleanup is called to cleanup all entries in cache with less than 3. In the mean time 2nd thread and 1st thread put 2,3 parts in to OmMultipartInfo in to Cache and doubleBuffer. But first thread might cleanup those entries, as it is called with index 3 for cleanup. Now when the 4th part upload came -> when it is commit Multipart upload when it gets multipartinfo it get Only part1 in OmMultipartInfo, as the OmMultipartInfo (with 1,2,3 is still in process of committing to DB). So now after 4th part upload is complete in DB and Cache we will have 1,4 parts only. We will miss part2,3 information. 
So for non-HA case cleanup will be called with list of epochs that need to be cleanedup. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2477 ## How was this patch tested? Added UT. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343028) Remaining Estimate: 0h Time Spent: 10m
[jira] [Updated] (HDDS-2477) TableCache cleanup issue for OM non-HA
[ https://issues.apache.org/jira/browse/HDDS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2477: - Labels: pull-request-available (was: )
[jira] [Created] (HDDS-2477) TableCache cleanup issue for OM non-HA
Bharat Viswanadham created HDDS-2477: Summary: TableCache cleanup issue for OM non-HA Key: HDDS-2477 URL: https://issues.apache.org/jira/browse/HDDS-2477 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham Assignee: Bharat Viswanadham
[jira] [Updated] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia updated HDDS-2469: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~adoroszlai] for the contribution, [~aengineer] for review. > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Ozone RPC client should not change input map from client while creating keys. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
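The one-line description above ("Ozone RPC client should not change input map from client while creating keys") corresponds to the defensive-copy pattern: copy the caller's map before adding client-side entries. A minimal sketch with illustrative names, not the actual Ozone client code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the defensive-copy fix: never mutate the metadata map the caller
// passed in. Names are illustrative, not the Ozone RPC client API.
public class KeyMetadataDemo {
  static Map<String, String> buildKeyMetadata(Map<String, String> userMetadata) {
    // Copy first, then add client-side entries to the copy only.
    Map<String, String> merged = new HashMap<>(userMetadata);
    merged.put("client-side-flag", "true"); // hypothetical client-added entry
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> user = new HashMap<>();
    user.put("owner", "alice");
    Map<String, String> merged = buildKeyMetadata(user);
    // The caller's map is untouched; only the copy grew.
    System.out.println(user.size() + " -> " + merged.size());
  }
}
```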
[jira] [Work logged] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?focusedWorklogId=343012=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343012 ] ASF GitHub Bot logged work on HDDS-2469: Author: ASF GitHub Bot Created on: 13/Nov/19 23:47 Start Date: 13/Nov/19 23:47 Worklog Time Spent: 10m Work Description: dineshchitlangia commented on pull request #154: HDDS-2469. Avoid changing client-side key metadata URL: https://github.com/apache/hadoop-ozone/pull/154 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343012) Time Spent: 20m (was: 10m) > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Ozone RPC client should not change input map from client while creating keys. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1889) Add support for verifying multiline log entry
[ https://issues.apache.org/jira/browse/HDDS-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Pogde updated HDDS-1889: - Attachment: image.png > Add support for verifying multiline log entry > - > > Key: HDDS-1889 > URL: https://issues.apache.org/jira/browse/HDDS-1889 > Project: Hadoop Distributed Data Store > Issue Type: Test > Components: test >Reporter: Dinesh Chitlangia >Priority: Major > Labels: newbie > Attachments: image.png > > > This jira aims to test the failure scenario where a multi-line stack trace > will be added in the audit log. Currently, for test assumes that even in > failure scenario we don't have multi-line log entry. > Example: > {code:java} > private static final AuditMessage READ_FAIL_MSG = > new AuditMessage.Builder() > .setUser("john") > .atIp("192.168.0.1") > .forOperation(DummyAction.READ_VOLUME.name()) > .withParams(PARAMS) > .withResult(FAILURE) > .withException(null).build(); > {code} > Therefore in verifyLog() we only compare for first line of the log file with > the expected message. > The test would fail if in future someone were to create a scenario with > multi-line log entry. > 1. Update READ_FAIL_MSG so that it has multiple lines of Exception stack > trace. > This is what multi-line log entry could look like: > {code:java} > ERROR | OMAudit | user=dchitlangia | ip=127.0.0.1 | op=GET_ACL > {volume=volume80100, bucket=bucket83878, key=null, aclType=CREATE, > resourceType=volume, storeType=ozone} | ret=FAILURE > org.apache.hadoop.ozone.om.exceptions.OMException: User dchitlangia doesn't > have CREATE permission to access volume > at org.apache.hadoop.ozone.om.OzoneManager.checkAcls(OzoneManager.java:1809) > ~[classes/:?] > at org.apache.hadoop.ozone.om.OzoneManager.checkAcls(OzoneManager.java:1769) > ~[classes/:?] > at > org.apache.hadoop.ozone.om.OzoneManager.createBucket(OzoneManager.java:2092) > ~[classes/:?] 
> at > org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.createBucket(OzoneManagerRequestHandler.java:526) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handle(OzoneManagerRequestHandler.java:185) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:192) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:110) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java) > ~[classes/:?] > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > ~[hadoop-common-3.2.0.jar:?] > at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_144] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_144] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > ~[hadoop-common-3.2.0.jar:?] > {code} > 2. Update verifyLog method to accept variable number of arguments. > 3. Update the assertion so that it compares beyond the first line when the > expected is a multi-line log entry. 
> {code:java} > assertTrue(expected.equalsIgnoreCase(lines.get(0))); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
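Points 2 and 3 above (a varargs verifyLog that compares beyond the first line) could look like the following sketch. The signature is illustrative, not the actual test utility in the Ozone codebase:

```java
import java.util.List;

// Sketch: verifyLog accepts a variable number of expected lines and compares
// each captured line, instead of only asserting on lines.get(0).
public class MultilineVerifyDemo {
  static boolean verifyLog(List<String> actualLines, String... expected) {
    if (actualLines.size() < expected.length) {
      return false; // fewer captured lines than expected
    }
    for (int i = 0; i < expected.length; i++) {
      // Compare line-by-line, so a multi-line stack trace entry is checked in full.
      if (!expected[i].equalsIgnoreCase(actualLines.get(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> captured = List.of(
        "ERROR | OMAudit | user=dchitlangia | op=GET_ACL | ret=FAILURE",
        "org.apache.hadoop.ozone.om.exceptions.OMException: ...");
    System.out.println(verifyLog(captured,
        "ERROR | OMAudit | user=dchitlangia | op=GET_ACL | ret=FAILURE",
        "org.apache.hadoop.ozone.om.exceptions.OMException: ..."));
  }
}
```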
[jira] [Work logged] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
[ https://issues.apache.org/jira/browse/HDDS-2472?focusedWorklogId=342996=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342996 ] ASF GitHub Bot logged work on HDDS-2472: Author: ASF GitHub Bot Created on: 13/Nov/19 23:27 Start Date: 13/Nov/19 23:27 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #158: HDDS-2472. Use try-with-resources while creating FlushOptions in RDBS… URL: https://github.com/apache/hadoop-ozone/pull/158 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342996) Time Spent: 20m (was: 10m) > Use try-with-resources while creating FlushOptions in RDBStore. > --- > > Key: HDDS-2472 > URL: https://issues.apache.org/jira/browse/HDDS-2472 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Link to the sonar issue flag - > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
[ https://issues.apache.org/jira/browse/HDDS-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2472: - Labels: pull-request-available (was: ) > Use try-with-resources while creating FlushOptions in RDBStore. > --- > > Key: HDDS-2472 > URL: https://issues.apache.org/jira/browse/HDDS-2472 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > > Link to the sonar issue flag - > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
[ https://issues.apache.org/jira/browse/HDDS-2472?focusedWorklogId=342995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342995 ] ASF GitHub Bot logged work on HDDS-2472: Author: ASF GitHub Bot Created on: 13/Nov/19 23:26 Start Date: 13/Nov/19 23:26 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #158: HDDS-2472. Use try-with-resources while creating FlushOptions in RDBS… URL: https://github.com/apache/hadoop-ozone/pull/158 …tore. ## What changes were proposed in this pull request? Use try-with-resources while creating FlushOptions in RDBStore class. Remove code duplication in getCheckpoint method. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2472 ## How was this patch tested? Unit tested. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342995) Remaining Estimate: 0h Time Spent: 10m > Use try-with-resources while creating FlushOptions in RDBStore. > --- > > Key: HDDS-2472 > URL: https://issues.apache.org/jira/browse/HDDS-2472 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Link to the sonar issue flag - > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
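The pattern this patch applies is standard try-with-resources on the native-backed FlushOptions object. The sketch below uses a stand-in AutoCloseable so it runs without the RocksDB dependency; it is not the actual org.rocksdb.FlushOptions or RDBStore code:

```java
// Sketch of the try-with-resources change: the native-backed options object
// must be released after use. Stand-in class, not org.rocksdb.FlushOptions.
public class FlushOptionsDemo {
  static boolean closed = false;

  static final class FlushOptions implements AutoCloseable {
    FlushOptions setWaitForFlush(boolean wait) { return this; }
    @Override public void close() { closed = true; }
  }

  static void flushDb() {
    // The resource is closed automatically, even if the flush body throws.
    try (FlushOptions opts = new FlushOptions().setWaitForFlush(true)) {
      // db.flush(opts) would go here in the real RDBStore code.
    }
  }

  public static void main(String[] args) {
    flushDb();
    System.out.println("FlushOptions closed: " + closed);
  }
}
```

Without the try-with-resources block, an early return or exception in the flush path would leak the native handle, which is what the Sonar flag linked above points at.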
[jira] [Updated] (HDDS-2474) Remove OzoneClient exception Precondition check
[ https://issues.apache.org/jira/browse/HDDS-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-2474: - Status: Patch Available (was: Open) > Remove OzoneClient exception Precondition check > --- > > Key: HDDS-2474 > URL: https://issues.apache.org/jira/browse/HDDS-2474 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If RaftCleintReply encounters an exception other than NotLeaderException, > NotReplicatedException, StateMachineException or LeaderNotReady, then it sets > success to false but there is no exception set. This causes a Precondition > check failure in XceiverClientRatis which expects that there should be an > exception if success=false. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
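The failure mode described above (success=false with no exception set) suggests handling the null-exception case explicitly rather than asserting it cannot happen. A hedged sketch with illustrative names, not the actual XceiverClientRatis code:

```java
import java.io.IOException;

// Sketch: handle a failed reply whose exception may be null, instead of a
// Precondition that assumes a failed reply always carries an exception.
public class ReplyCheckDemo {
  static void checkReply(boolean success, Throwable cause) throws IOException {
    if (success) {
      return;
    }
    if (cause != null) {
      throw new IOException("Ratis request failed", cause);
    }
    // Previously a Precondition here assumed cause != null and crashed when
    // Ratis set success=false without setting an exception.
    throw new IOException("Ratis request failed with no exception set");
  }

  public static void main(String[] args) {
    try {
      checkReply(false, null); // the case that used to trip the Precondition
    } catch (IOException e) {
      System.out.println("handled: " + e.getMessage());
    }
  }
}
```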
[jira] [Updated] (HDDS-2474) Remove OzoneClient exception Precondition check
[ https://issues.apache.org/jira/browse/HDDS-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2474: - Labels: pull-request-available (was: ) > Remove OzoneClient exception Precondition check > --- > > Key: HDDS-2474 > URL: https://issues.apache.org/jira/browse/HDDS-2474 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > > If RaftCleintReply encounters an exception other than NotLeaderException, > NotReplicatedException, StateMachineException or LeaderNotReady, then it sets > success to false but there is no exception set. This causes a Precondition > check failure in XceiverClientRatis which expects that there should be an > exception if success=false. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2474) Remove OzoneClient exception Precondition check
[ https://issues.apache.org/jira/browse/HDDS-2474?focusedWorklogId=342994=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342994 ] ASF GitHub Bot logged work on HDDS-2474: Author: ASF GitHub Bot Created on: 13/Nov/19 23:24 Start Date: 13/Nov/19 23:24 Worklog Time Spent: 10m Work Description: hanishakoneru commented on pull request #157: HDDS-2474. Remove OzoneClient exception Precondition check. URL: https://github.com/apache/hadoop-ozone/pull/157 ## What changes were proposed in this pull request? If RaftClientReply encounters an exception other than NotLeaderException, NotReplicatedException, StateMachineException or LeaderNotReady, then it sets success to false but there is no exception set. This causes a Precondition check failure in XceiverClientRatis which expects that there should be an exception if success=false. This Jira proposes to remove the Precondition check. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2474 ## How was this patch tested? This patch does not require a unit test. We are just removing a Precondition check. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342994) Remaining Estimate: 0h Time Spent: 10m > Remove OzoneClient exception Precondition check > --- > > Key: HDDS-2474 > URL: https://issues.apache.org/jira/browse/HDDS-2474 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If RaftClientReply encounters an exception other than NotLeaderException, > NotReplicatedException, StateMachineException or LeaderNotReady, then it sets > success to false but there is no exception set. 
This causes a Precondition > check failure in XceiverClientRatis which expects that there should be an > exception if success=false.
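The failure mode described in HDDS-2474 can be sketched with a toy model. All names below are hypothetical stand-ins (not the actual Ratis or Ozone classes): a reply may carry success=false with no exception attached, so a strict Precondition-style check fails, whereas the tolerant variant substitutes a generic cause.

```java
// Toy model of the described failure mode. ReplySketch stands in for a
// RaftClientReply-like object; the real Ratis/Ozone classes differ.
final class ReplySketch {
    final boolean success;
    final Exception exception; // may be null even when success == false

    ReplySketch(boolean success, Exception exception) {
        this.success = success;
        this.exception = exception;
    }

    // Old behavior: a Precondition-style check assumes that !success
    // implies a non-null exception, which breaks for "other" exception types.
    static Exception strictCheck(ReplySketch r) {
        if (!r.success && r.exception == null) {
            throw new IllegalStateException("success=false but no exception set");
        }
        return r.exception;
    }

    // After removing the check: fall back to a generic cause instead.
    static Exception tolerant(ReplySketch r) {
        if (!r.success && r.exception == null) {
            return new java.io.IOException("request failed without a reported cause");
        }
        return r.exception;
    }
}
```

This only illustrates why removing the check is safe; the patch itself simply deletes the assertion.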
[jira] [Created] (HDDS-2476) Share more code between metadata and data scanners
Attila Doroszlai created HDDS-2476: -- Summary: Share more code between metadata and data scanners Key: HDDS-2476 URL: https://issues.apache.org/jira/browse/HDDS-2476 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Attila Doroszlai There are several duplicated / similar pieces of code in metadata and data scanners. More code should be reused. Examples: # ContainerDataScrubberMetrics and ContainerMetadataScrubberMetrics have 3 common metrics # lifecycle of ContainerMetadataScanner and ContainerDataScanner (main loop, iteration, metrics processing, shutdown)
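One way to realize the proposed sharing is an abstract base class that owns the common lifecycle (main loop, per-iteration metrics update, shutdown hook) and leaves only the scan step abstract. This is a sketch under assumed names, not the actual Ozone scanner classes:

```java
// Sketch of factoring the shared scanner lifecycle into a base class.
// Names are hypothetical; ContainerMetadataScanner and ContainerDataScanner
// would each subclass something like this.
abstract class ScannerBaseSketch {
    private volatile boolean stopRequested = false;
    int iterations = 0; // stands in for the shared metrics counters

    // Shared main loop: iterate, record metrics, honor shutdown requests.
    public final void runLoop(int maxIterations) {
        while (!stopRequested && iterations < maxIterations) {
            scanIteration(); // metadata- vs. data-specific work lives here
            iterations++;    // shared metrics update
        }
        shutdown();          // shared cleanup hook
    }

    public void stop() { stopRequested = true; }

    protected abstract void scanIteration();

    protected void shutdown() {}
}
```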
[jira] [Created] (HDDS-2475) Unregister ContainerMetadataScrubberMetrics on thread exit
Attila Doroszlai created HDDS-2475: -- Summary: Unregister ContainerMetadataScrubberMetrics on thread exit Key: HDDS-2475 URL: https://issues.apache.org/jira/browse/HDDS-2475 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Attila Doroszlai {{ContainerMetadataScanner}} thread should call {{ContainerMetadataScrubberMetrics#unregister}} before exiting.
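A minimal sketch of this kind of fix (hypothetical names, not the actual Hadoop metrics API): wrap the thread body in try/finally so unregistration runs on every exit path, including an exception thrown from the scan loop.

```java
// Sketch: guarantee metrics unregistration on thread exit.
final class MetricsLifecycleSketch {
    static boolean registered = false;

    static void register()   { registered = true;  }
    static void unregister() { registered = false; }

    // The scanner thread's body: the finally block runs even if the
    // loop exits via an exception, so the metrics are never leaked.
    static void runScanner(Runnable loop) {
        register();
        try {
            loop.run();
        } finally {
            unregister();
        }
    }
}
```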
[jira] [Commented] (HDFS-14283) DFSInputStream to prefer cached replica
[ https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973771#comment-16973771 ] Siyao Meng commented on HDFS-14283: --- [~leosun08] Code looks good. Would you add test cases in {{TestDFSInputStream}} to demonstrate: 1) when client sets {{dfs.client.read.use.cache.priority}} to true, a cached block will be used; 2) when the config is set to false, it won't affect the current behavior of the client without the patch. Thanks! > DFSInputStream to prefer cached replica > --- > > Key: HDFS-14283 > URL: https://issues.apache.org/jira/browse/HDFS-14283 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.6.0 > Environment: HDFS Caching >Reporter: Wei-Chiu Chuang >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch, > HDFS-14283.003.patch, HDFS-14283.004.patch, HDFS-14283.005.patch > > > HDFS Caching offers performance benefits. However, currently NameNode does > not treat cached replica with higher priority, so HDFS caching is only useful > when cache replication = 3, that is to say, all replicas are cached in > memory, so that a client doesn't randomly pick an uncached replica. > HDFS-6846 proposed to let NameNode give higher priority to cached replica. > Changing a logic in NameNode is always tricky so that didn't get much > traction. Here I propose a different approach: let client (DFSInputStream) > prefer cached replica. > A {{LocatedBlock}} object already contains cached replica location so a > client has the needed information. I think we can change > {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
[jira] [Comment Edited] (HDFS-14283) DFSInputStream to prefer cached replica
[ https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973771#comment-16973771 ] Siyao Meng edited comment on HDFS-14283 at 11/13/19 10:51 PM: -- [~leosun08] Thanks! Code looks good. Would you add test cases in {{TestDFSInputStream}} to demonstrate: 1) when client sets {{dfs.client.read.use.cache.priority}} to true, a cached block will be used; 2) when the config is set to false, it won't affect the current behavior of the client without the patch. And would you address the checkstyle? was (Author: smeng): [~leosun08] Code looks good. Would you add test cases in {{TestDFSInputStream}} to demonstrate: 1) when client sets {{dfs.client.read.use.cache.priority}} to true, a cached block will be used; 2) when the config is set to false, it won't affect the current behavior of the client without the patch. Thanks! > DFSInputStream to prefer cached replica > --- > > Key: HDFS-14283 > URL: https://issues.apache.org/jira/browse/HDFS-14283 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.6.0 > Environment: HDFS Caching >Reporter: Wei-Chiu Chuang >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch, > HDFS-14283.003.patch, HDFS-14283.004.patch, HDFS-14283.005.patch > > > HDFS Caching offers performance benefits. However, currently NameNode does > not treat cached replica with higher priority, so HDFS caching is only useful > when cache replication = 3, that is to say, all replicas are cached in > memory, so that a client doesn't randomly pick an uncached replica. > HDFS-6846 proposed to let NameNode give higher priority to cached replica. > Changing a logic in NameNode is always tricky so that didn't get much > traction. Here I propose a different approach: let client (DFSInputStream) > prefer cached replica. > A {{LocatedBlock}} object already contains cached replica location so a > client has the needed information. 
I think we can change > {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
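The client-side selection being proposed can be sketched in simplified form. This is a toy model (plain strings instead of DatanodeInfo, a boolean standing in for the dfs.client.read.use.cache.priority config), not the actual DFSInputStream change:

```java
import java.util.List;

// Sketch of the proposed selection: when preferCached is on, scan the
// replica list for one that is also in the cached-locations list (which a
// LocatedBlock already carries); otherwise keep the existing behavior of
// taking the first usable replica.
final class ReplicaPickSketch {
    static String pick(List<String> replicas, List<String> cached, boolean preferCached) {
        if (preferCached) {
            for (String dn : replicas) {
                if (cached.contains(dn)) {
                    return dn; // cached replica wins
                }
            }
        }
        return replicas.isEmpty() ? null : replicas.get(0); // unchanged fallback
    }
}
```

With the flag off, behavior is identical to the fallback path, which matches the reviewer's second requested test case.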
[jira] [Created] (HDDS-2474) Remove OzoneClient exception Precondition check
Hanisha Koneru created HDDS-2474: Summary: Remove OzoneClient exception Precondition check Key: HDDS-2474 URL: https://issues.apache.org/jira/browse/HDDS-2474 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Hanisha Koneru Assignee: Hanisha Koneru If RaftClientReply encounters an exception other than NotLeaderException, NotReplicatedException, StateMachineException or LeaderNotReady, then it sets success to false but there is no exception set. This causes a Precondition check failure in XceiverClientRatis which expects that there should be an exception if success=false.
[jira] [Resolved] (HDDS-1847) Datanode Kerberos principal and keytab config key looks inconsistent
[ https://issues.apache.org/jira/browse/HDDS-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-1847. Fix Version/s: 0.5.0 Resolution: Fixed [~chris.t...@gmail.com] Thanks for the contribution. [~elek] Thanks for retesting this patch. I have committed this change to the master branch. > Datanode Kerberos principal and keytab config key looks inconsistent > > > Key: HDDS-1847 > URL: https://issues.apache.org/jira/browse/HDDS-1847 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Eric Yang >Assignee: Chris Teoh >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Ozone Kerberos configuration can be very confusing: > | config name | Description | > | hdds.scm.kerberos.principal | SCM service principal | > | hdds.scm.kerberos.keytab.file | SCM service keytab file | > | ozone.om.kerberos.principal | Ozone Manager service principal | > | ozone.om.kerberos.keytab.file | Ozone Manager keytab file | > | hdds.scm.http.kerberos.principal | SCM service spnego principal | > | hdds.scm.http.kerberos.keytab.file | SCM service spnego keytab file | > | ozone.om.http.kerberos.principal | Ozone Manager spnego principal | > | ozone.om.http.kerberos.keytab.file | Ozone Manager spnego keytab file | > | hdds.datanode.http.kerberos.keytab | Datanode spnego keytab file | > | hdds.datanode.http.kerberos.principal | Datanode spnego principal | > | dfs.datanode.kerberos.principal | Datanode service principal | > | dfs.datanode.keytab.file | Datanode service keytab file | > The prefixes are very different for each of the datanode configurations. It > would be nice to have some consistency for datanode.
[jira] [Updated] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2473: Description: sonarcloud.io has flagged a number of code reliability issues in Ozone recon (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). Following issues will be triaged / fixed. * Double Brace Initialization should not be used * Resources should be closed * InterruptedException should not be ignored > Fix code reliability issues found by Sonar in Ozone Recon module. > - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Fix For: 0.5.0 > > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored
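The three Sonar rules listed for HDDS-2473 can each be illustrated with a generic before/after pattern. These are standalone examples of the rules, not the actual Recon code paths:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Generic illustrations of the three Sonar reliability rules.
final class SonarFixSketch {
    // 1. Avoid double-brace initialization (it creates an anonymous
    //    subclass holding an outer-class reference); build explicitly.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> m = new HashMap<>();
        m.put("containers", 1);
        return m;
    }

    // 2. Close resources deterministically with try-with-resources.
    static int firstChar(String s) {
        try (StringReader r = new StringReader(s)) {
            return r.read();
        } catch (IOException e) {
            return -1;
        }
    }

    // 3. Never swallow InterruptedException; restore the interrupt flag
    //    so callers up the stack can still observe the interruption.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```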
[jira] [Work logged] (HDDS-1847) Datanode Kerberos principal and keytab config key looks inconsistent
[ https://issues.apache.org/jira/browse/HDDS-1847?focusedWorklogId=342947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342947 ] ASF GitHub Bot logged work on HDDS-1847: Author: ASF GitHub Bot Created on: 13/Nov/19 22:01 Start Date: 13/Nov/19 22:01 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #115: HDDS-1847: Datanode Kerberos principal and keytab config key looks inconsistent URL: https://github.com/apache/hadoop-ozone/pull/115 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342947) Time Spent: 1h 40m (was: 1.5h) > Datanode Kerberos principal and keytab config key looks inconsistent > > > Key: HDDS-1847 > URL: https://issues.apache.org/jira/browse/HDDS-1847 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Eric Yang >Assignee: Chris Teoh >Priority: Major > Labels: newbie, pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Ozone Kerberos configuration can be very confusing: > | config name | Description | > | hdds.scm.kerberos.principal | SCM service principal | > | hdds.scm.kerberos.keytab.file | SCM service keytab file | > | ozone.om.kerberos.principal | Ozone Manager service principal | > | ozone.om.kerberos.keytab.file | Ozone Manager keytab file | > | hdds.scm.http.kerberos.principal | SCM service spnego principal | > | hdds.scm.http.kerberos.keytab.file | SCM service spnego keytab file | > | ozone.om.http.kerberos.principal | Ozone Manager spnego principal | > | ozone.om.http.kerberos.keytab.file | Ozone Manager spnego keytab file | > | hdds.datanode.http.kerberos.keytab | Datanode spnego keytab file | > | hdds.datanode.http.kerberos.principal | Datanode spnego principal | > | dfs.datanode.kerberos.principal | 
Datanode service principal | > | dfs.datanode.keytab.file | Datanode service keytab file | > The prefixes are very different for each of the datanode configurations. It > would be nice to have some consistency for datanode.
[jira] [Work logged] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist
[ https://issues.apache.org/jira/browse/HDDS-2364?focusedWorklogId=342941=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342941 ] ASF GitHub Bot logged work on HDDS-2364: Author: ASF GitHub Bot Created on: 13/Nov/19 21:59 Start Date: 13/Nov/19 21:59 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #101: HDDS-2364. Add OM metrics to find the false positive rate for the keyMayExist. URL: https://github.com/apache/hadoop-ozone/pull/101 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342941) Time Spent: 20m (was: 10m) > Add a OM metrics to find the false positive rate for the keyMayExist > > > Key: HDDS-2364 > URL: https://issues.apache.org/jira/browse/HDDS-2364 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Add a OM metrics to find the false positive rate for the keyMayExist.
[jira] [Resolved] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist
[ https://issues.apache.org/jira/browse/HDDS-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2364. Fix Version/s: 0.5.0 Resolution: Fixed [~avijayan] Thanks for the contribution. [~bharat] Thanks for the reviews. I have committed this to the master branch. > Add a OM metrics to find the false positive rate for the keyMayExist > > > Key: HDDS-2364 > URL: https://issues.apache.org/jira/browse/HDDS-2364 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Add a OM metrics to find the false positive rate for the keyMayExist.
[jira] [Created] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
Aravindan Vijayan created HDDS-2473: --- Summary: Fix code reliability issues found by Sonar in Ozone Recon module. Key: HDDS-2473 URL: https://issues.apache.org/jira/browse/HDDS-2473 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Recon Affects Versions: 0.5.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0
[jira] [Comment Edited] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973732#comment-16973732 ] Erik Krogen edited comment on HDFS-14973 at 11/13/19 9:55 PM: -- For (1), if we do this it will force the {{ExternalStoragePolicySatisfier}}, {{Mover}}, and {{Balancer}} all to use the same rate limiting (as they all make use of the {{NameNodeConnector}}). I'm not sure if this is desirable or not, any thoughts on that? This is why I passed it in externally. For (2), I took a look, as I generally am in agreement with you that not recreating miniclusters is a good thing. I don't see a good way to do it here, since {{doTest}} makes use of the ability to configure the exact capacities of the DataNodes it starts up in the new cluster. I tried refactoring {{TestBalancer}} a bit to make it so that there is a {{setupMiniCluster}} step followed by a {{doTestWithoutClusterSetup}} step, but then it becomes very difficult to properly fill the cluster _only_ on the original DataNodes and not the new DataNode, leaving a single one empty. Let me know if you have really strong feelings about this and I can try some more, but for now it seems like it would require some substantial refactoring of {{TestBalancer}}. For running against the default parameter of 20, it wasn't possible with the existing code since it asserts that the number of getBlocks calls is _higher_ than the rate limit. I changed it to greater-than-or-equal-to instead of a strict greater-than to get it to work with the default of 20. Uploaded v002 patch with this change. was (Author: xkrogen): For (1), if we do this it will force the {{ExternalStoragePolicySatisfier}}, {{Mover}}, and {{Balancer}} all to use the same rate limiting (as they all make use of the {{NameNodeConnector}}. I'm not sure if this is desirable or not, any thoughts on that? This is why I passed it in externally. 
For (2), I took a look, as I generally am in agreement with you that not recreating miniclusters is a good thing. I don't see a good way to do it here, since {{doTest}} makes use of the ability to configure the exact capacities of the DataNodes it starts up in the new cluster. I tried refactoring {{TestBalancer}} a bit to make it so that there is a {{setupMiniCluster}} step followed by a {{doTestWithoutClusterSetup}} step, but then it becomes very difficult to properly fill the cluster _only_ on the original DataNodes and not the new DataNode, leaving a single one empty. Let me know if you have really strong feelings about this and I can try some more, but for now it seems like it would require some substantial refactoring of {{TestBalancer}}. For running against the default parameter of 20, it wasn't possible with the existing code since it asserts that the number of getBlocks calls is _higher_ than the rate limit. I changed it to greater-than-or-equal-to instead of a strict greater-than to get it to work with the default of 20. Uploaded v002 patch with this change. > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. 
The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), >
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973732#comment-16973732 ] Erik Krogen commented on HDFS-14973: For (1), if we do this it will force the {{ExternalStoragePolicySatisfier}}, {{Mover}}, and {{Balancer}} all to use the same rate limiting (as they all make use of the {{NameNodeConnector}}). I'm not sure if this is desirable or not, any thoughts on that? This is why I passed it in externally. For (2), I took a look, as I generally am in agreement with you that not recreating miniclusters is a good thing. I don't see a good way to do it here, since {{doTest}} makes use of the ability to configure the exact capacities of the DataNodes it starts up in the new cluster. I tried refactoring {{TestBalancer}} a bit to make it so that there is a {{setupMiniCluster}} step followed by a {{doTestWithoutClusterSetup}} step, but then it becomes very difficult to properly fill the cluster _only_ on the original DataNodes and not the new DataNode, leaving a single one empty. Let me know if you have really strong feelings about this and I can try some more, but for now it seems like it would require some substantial refactoring of {{TestBalancer}}. For running against the default parameter of 20, it wasn't possible with the existing code since it asserts that the number of getBlocks calls is _higher_ than the rate limit. I changed it to greater-than-or-equal-to instead of a strict greater-than to get it to work with the default of 20. Uploaded v002 patch with this change. 
> Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. 
> This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNamesystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching.
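The indexing flaw described in this report can be modeled in a few lines. This is a toy model of the delay assignment as described above (with pool size P and rate limit R, only submission indices in [R, P) receive a delay), not the actual Dispatcher code:

```java
// Toy model of the Balancer delay assignment described above: with
// threadpool size P and getBlocks rate limit R (hardcoded to 20), only
// submission indices in [R, P) are delayed, so the first R tasks and
// every submission past P proceed immediately.
final class DelaySketch {
    static long delayFor(int submissionIndex, int poolSize, int rateLimit, long delayMs) {
        if (submissionIndex >= rateLimit && submissionIndex < poolSize) {
            return delayMs; // the "delayed band"
        }
        return 0L; // indices 0..R-1 and P.. get no delay
    }
}
```

Plugging in the defaults reproduces both symptoms: with P=100 and R=20, indices 0-19 and 100+ are undelayed, and with P=10 (smaller than R) no submission is ever delayed at all.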
[jira] [Updated] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14973: --- Attachment: HDFS-14973.002.patch > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. 
However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. > This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNamesystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching.
[jira] [Created] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
Aravindan Vijayan created HDDS-2472: --- Summary: Use try-with-resources while creating FlushOptions in RDBStore. Key: HDDS-2472 URL: https://issues.apache.org/jira/browse/HDDS-2472 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Affects Versions: 0.5.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0 Link to the sonar issue flag - https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4.
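The shape of the fix can be sketched as follows. RocksDB's FlushOptions wraps a native handle and must be close()d; here a stand-in FakeFlushOptions is used so the example is self-contained, and the actual RDBStore code would of course use the real class:

```java
// Sketch of the fix: manage a native-handle object (like RocksDB's
// FlushOptions) with try-with-resources so it is released on every path.
final class FlushSketch {
    static boolean closed = false;

    // Stand-in for org.rocksdb.FlushOptions, which is AutoCloseable.
    static final class FakeFlushOptions implements AutoCloseable {
        FakeFlushOptions setWaitForFlush(boolean wait) { return this; }
        @Override
        public void close() { closed = true; } // releases the native handle
    }

    static void flush() {
        try (FakeFlushOptions opts = new FakeFlushOptions().setWaitForFlush(true)) {
            // db.flush(opts) would go here in the real code
        } // close() runs on every exit path, including exceptions
    }
}
```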
[jira] [Resolved] (HDDS-2412) Define description/topics/merge strategy for the github repository with .asf.yaml
[ https://issues.apache.org/jira/browse/HDDS-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2412. Fix Version/s: 0.5.0 Resolution: Fixed Thanks, I have committed this patch to the master. [~elek] Thanks for the contribution. [~adoroszlai] Thanks for the reviews. > Define description/topics/merge strategy for the github repository with > .asf.yaml > - > > Key: HDDS-2412 > URL: https://issues.apache.org/jira/browse/HDDS-2412 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > .asf.yaml helps to set different parameters on github repositories without > admin privileges: > [https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories] > This basic .asf.yaml defines description/url/topics and the allowed merge > buttons.
[jira] [Work logged] (HDDS-2412) Define description/topics/merge strategy for the github repository with .asf.yaml
[ https://issues.apache.org/jira/browse/HDDS-2412?focusedWorklogId=342931=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342931 ] ASF GitHub Bot logged work on HDDS-2412: Author: ASF GitHub Bot Created on: 13/Nov/19 21:48 Start Date: 13/Nov/19 21:48 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #125: HDDS-2412. Define description/topics/merge strategy for the github repository URL: https://github.com/apache/hadoop-ozone/pull/125 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342931) Time Spent: 20m (was: 10m)
[jira] [Resolved] (HDDS-2400) Enable github actions based builds for Ozone
[ https://issues.apache.org/jira/browse/HDDS-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2400. Fix Version/s: 0.5.0 Resolution: Fixed Thanks, Committed to the master. > Enable github actions based builds for Ozone > > > Key: HDDS-2400 > URL: https://issues.apache.org/jira/browse/HDDS-2400 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Current PR checks are executed in a private branch based on the scripts in > [https://github.com/elek/argo-ozone] > but the results are stored in public repositories: > [https://github.com/elek/ozone-ci-q4|https://github.com/elek/ozone-ci-q3] > [https://github.com/elek/ozone-ci-03] > > As we discussed during the community calls, it would be great to use github > actions (or any other cloud based build) to make all the build definitions > more accessible for the community. > [~vivekratnavel] checked CircleCI which has better reporting capabilities. > But INFRA has concerns about the permission model of circle-ci: > {quote}it is highly unlikely we will allow a bot to be able to commit code > (whether or not that is the intention, allowing circle-ci will make this > possible, and is a complete no) > {quote} > See: > https://issues.apache.org/jira/browse/INFRA-18131 > [https://lists.apache.org/thread.html/af52e2a3e865c01596d46374e8b294f2740587dbd59d85e132429b6c@%3Cbuilds.apache.org%3E] > > Fortunately we have a clear contract. Our build scripts are stored under > _hadoop-ozone/dev-support/checks_ (return codes show the result, details are > printed out to the console output). It's very easy to experiment with > different build systems. > > GitHub Actions seems to be an obvious choice: it's integrated well with GitHub > and it has more generous resource limitations. 
> > With this Jira I propose to enable github actions based PR checks for a few > tests (author, rat, unit, acceptance, checkstyle, findbugs) as an experiment.
[jira] [Work logged] (HDDS-2400) Enable github actions based builds for Ozone
[ https://issues.apache.org/jira/browse/HDDS-2400?focusedWorklogId=342929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342929 ] ASF GitHub Bot logged work on HDDS-2400: Author: ASF GitHub Bot Created on: 13/Nov/19 21:45 Start Date: 13/Nov/19 21:45 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #122: HDDS-2400. Enable github actions based builds for Ozone URL: https://github.com/apache/hadoop-ozone/pull/122 Issue Time Tracking --- Worklog Id: (was: 342929) Time Spent: 20m (was: 10m)
[jira] [Comment Edited] (HDDS-2392) Fix TestScmSafeMode#testSCMSafeModeRestrictedOp
[ https://issues.apache.org/jira/browse/HDDS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973716#comment-16973716 ] Hanisha Koneru edited comment on HDDS-2392 at 11/13/19 9:30 PM: Thank you [~avijayan]. This issue is fixed by RATIS-747 was (Author: hanishakoneru): Fixed by RATIS-747 > Fix TestScmSafeMode#testSCMSafeModeRestrictedOp > --- > > Key: HDDS-2392 > URL: https://issues.apache.org/jira/browse/HDDS-2392 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Blocker > > After ratis upgrade (HDDS-2340), TestScmSafeMode#testSCMSafeModeRestrictedOp > fails as the DNs fail to restart XceiverServerRatis. > RaftServer#start() fails with following exception: > {code:java} > java.io.IOException: java.lang.IllegalStateException: Not started > at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54) > at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61) > at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:284) > at > org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:296) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:421) > at > org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:215) > at > org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:110) > at > org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.IllegalStateException: Not started > at > org.apache.ratis.thirdparty.com.google.common.base.Preconditions.checkState(Preconditions.java:504) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.getPort(ServerImpl.java:176) > at > org.apache.ratis.grpc.server.GrpcService.lambda$new$2(GrpcService.java:143) > at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62) > at > org.apache.ratis.grpc.server.GrpcService.getInetSocketAddress(GrpcService.java:182) > at > org.apache.ratis.server.impl.RaftServerImpl.lambda$new$0(RaftServerImpl.java:84) > at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62) > at > org.apache.ratis.server.impl.RaftServerImpl.getPeer(RaftServerImpl.java:136) > at > org.apache.ratis.server.impl.RaftServerMetrics.(RaftServerMetrics.java:70) > at > org.apache.ratis.server.impl.RaftServerMetrics.getRaftServerMetrics(RaftServerMetrics.java:62) > at > org.apache.ratis.server.impl.RaftServerImpl.(RaftServerImpl.java:119) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208) > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) > {code}
[jira] [Resolved] (HDDS-2392) Fix TestScmSafeMode#testSCMSafeModeRestrictedOp
[ https://issues.apache.org/jira/browse/HDDS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru resolved HDDS-2392. -- Resolution: Fixed Fixed by RATIS-747
[jira] [Assigned] (HDDS-2471) Improve exception message for CompleteMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham reassigned HDDS-2471: Assignee: Bharat Viswanadham > Improve exception message for CompleteMultipartUpload > - > > Key: HDDS-2471 > URL: https://issues.apache.org/jira/browse/HDDS-2471 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > When the InvalidPart error occurs, the exception message does not include any > information about partName and partNumber; it would be good to have this > information.
[jira] [Updated] (HDDS-2471) Improve exception message for CompleteMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2471: - Status: Patch Available (was: Open)
[jira] [Work logged] (HDDS-2471) Improve exception message for CompleteMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2471?focusedWorklogId=342876=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342876 ] ASF GitHub Bot logged work on HDDS-2471: Author: ASF GitHub Bot Created on: 13/Nov/19 20:54 Start Date: 13/Nov/19 20:54 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #156: HDDS-2471. Improve exception message for CompleteMultipartUpload. URL: https://github.com/apache/hadoop-ozone/pull/156 ## What changes were proposed in this pull request? Add partName and partNumber to the exception message when the InvalidPart error occurs; this will help when debugging issues from logs. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2471 ## How was this patch tested? Just added more information to the exception message. Issue Time Tracking --- Worklog Id: (was: 342876) Remaining Estimate: 0h Time Spent: 10m
[jira] [Updated] (HDDS-2471) Improve exception message for CompleteMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2471: - Labels: pull-request-available (was: )
[jira] [Created] (HDDS-2471) Improve exception message for CompleteMultipartUpload
Bharat Viswanadham created HDDS-2471: Summary: Improve exception message for CompleteMultipartUpload Key: HDDS-2471 URL: https://issues.apache.org/jira/browse/HDDS-2471 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham When the InvalidPart error occurs, the exception message does not include any information about partName and partNumber; it would be good to have this information.
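A hedged sketch of the kind of change proposed for HDDS-2471: include the part name and part number in the InvalidPart error text. The method name and message format below are illustrative only, not the actual OzoneManager code:

```java
// Illustrative only: building an InvalidPart error message that carries
// the offending partName and partNumber, so a log line alone is enough
// to identify which part of the multipart upload was rejected.
public class InvalidPartMessageDemo {
    static String invalidPartMessage(String key, String partName, int partNumber) {
        return "Complete Multipart Upload failed for key " + key
            + ": invalid part (partNumber=" + partNumber
            + ", partName=" + partName + ")";
    }

    public static void main(String[] args) {
        System.out.println(invalidPartMessage("vol/bucket/key1", "part-0002", 2));
    }
}
```

Without the part details, an InvalidPart failure in a many-part upload forces the operator to cross-reference client-side state; with them, the failing part is identifiable from the server log alone.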
[jira] [Commented] (HDFS-14985) FSCK for a block of EC Files doesn't display status at the end
[ https://issues.apache.org/jira/browse/HDFS-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973661#comment-16973661 ] Surendra Singh Lilhore commented on HDFS-14985: --- [~Sushma_28], it is a duplicate of HDFS-14987. You can work on HDFS-14987. > FSCK for a block of EC Files doesn't display status at the end > - > > Key: HDFS-14985 > URL: https://issues.apache.org/jira/browse/HDFS-14985 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ravuri Sushma sree >Assignee: Ravuri Sushma sree >Priority: Major > > *Environment* Cluster of 2 Namenodes and 5 Datanodes and ec policy enabled > fsck -blockId of a block associated with an EC File does not print status at > the end and displays null instead > {color:#de350b}*Result :*{color} > ./hdfs fsck -blockId blk_-x > Connecting to namenode via > FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST > 2019 > Block Id: blk_-x > Block belongs to: /ecdir/f2 > No. of Expected Replica: 3 > No. of live Replica: 3 > No. of excess Replica: 0 > No. of stale Replica: 2 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > null > {color:#de350b}*Expected :*{color} > ./hdfs fsck -blockId blk_-x > Connecting to namenode via > FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST > 2019 > Block Id: blk_-x > Block belongs to: /ecdir/f2 > No. of Expected Replica: 3 > No. of live Replica: 3 > No. of excess Replica: 0 > No. of stale Replica: 2 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > Block replica on datanode/rack: vm10/default-rack is HEALTHY
[jira] [Updated] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14979: --- Resolution: Fixed Status: Resolved (was: Patch Available) > [Observer Node] Balancer should submit getBlocks to Observer Node when > possible > --- > > Key: HDFS-14979 > URL: https://issues.apache.org/jira/browse/HDFS-14979 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, hdfs >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1, 2.11.0 > > Attachments: HDFS-14979.000.patch > > > In HDFS-14162, we made it so that the Balancer could function when > {{ObserverReadProxyProvider}} was in use. However, the Balancer would still > read from the active NameNode, because {{getBlocks}} wasn't annotated as > {{@ReadOnly}}. This task is to enable the Balancer to actually read from the > Observer Node to alleviate load from the active NameNode.
[jira] [Comment Edited] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973656#comment-16973656 ] Erik Krogen edited comment on HDFS-14979 at 11/13/19 8:10 PM: -- Committed this down to branch-2.10 (trunk, branch-3.2, branch-3.1, branch-2, branch-2.10). Wasn't sure if I should do branch-2 given that I think we've decided 2.10 is the last 2.x release, but figured it couldn't hurt. Thanks for the review [~shv]! was (Author: xkrogen): Committed this down to branch-2.10 (trunk, branch-3.2, branch-3.1, branch-2, branch-2.10). Wasn't sure if I should do branch-2 given that I think we've decided 2.10 is the last 2.x release, but figured it couldn't hurt.
[jira] [Updated] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14979: --- Fix Version/s: 2.11.0 2.10.1 3.2.2 3.1.4 3.3.0
[jira] [Commented] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973656#comment-16973656 ] Erik Krogen commented on HDFS-14979: Committed this down to branch-2.10 (trunk, branch-3.2, branch-3.1, branch-2, branch-2.10). Wasn't sure if I should do branch-2 given that I think we've decided 2.10 is the last 2.x release, but figured it couldn't hurt.
[jira] [Commented] (HDFS-14655) [SBN Read] Namenode crashes if one of The JN is down
[ https://issues.apache.org/jira/browse/HDFS-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973655#comment-16973655 ] Chen Liang commented on HDFS-14655: --- [~xkrogen] seems like a different message, looks to me that this one happens when a {{Future}} instance got cancelled. > [SBN Read] Namenode crashes if one of The JN is down > > > Key: HDFS-14655 > URL: https://issues.apache.org/jira/browse/HDFS-14655 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Critical > Fix For: 2.10.0, 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14655-01.patch, HDFS-14655-02.patch, > HDFS-14655-03.patch, HDFS-14655-04.patch, HDFS-14655-05.patch, > HDFS-14655-06.patch, HDFS-14655-07.patch, HDFS-14655-08.patch, > HDFS-14655-branch-2-01.patch, HDFS-14655-branch-2-02.patch, > HDFS-14655.poc.patch > > > {noformat} > 2019-07-04 17:35:54,064 | INFO | Logger channel (from parallel executor) to > XXX/XXX | Retrying connect to server: XXX/XXX. Already tried > 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, > sleepTime=1000 MILLISECONDS) | Client.java:975 > 2019-07-04 17:35:54,087 | FATAL | Edit log tailer | Unknown error encountered > while tailing edits. Shutting down standby NN. 
| EditLogTailer.java:474 > java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:717) > at > java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378) > at > com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440) > at > com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.getJournaledEdits(IPCLoggerChannel.java:565) > at > org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.getJournaledEdits(AsyncLoggerSet.java:272) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectRpcInputStreams(QuorumJournalManager.java:533) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:508) > at > org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:275) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1681) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1714) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:307) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:410) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) > at > 
org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:483) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > 2019-07-04 17:35:54,112 | INFO | Edit log tailer | Exiting with status 1: > java.lang.OutOfMemoryError: unable to create new native thread | > ExitUtil.java:210 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
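The OOM above fires when {{getJournaledEdits}} keeps submitting work to an executor while a JournalNode is unreachable: if thread creation is unbounded, the piled-up retries eventually make {{Thread.start0}} fail with "unable to create new native thread". A minimal stdlib sketch of the mitigation follows — the pool sizes and class name are illustrative, not the actual IPCLoggerChannel code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedLoggerExecutor {
    // Illustrative cap: bounding the per-logger executor means repeated RPC
    // submissions to a down JournalNode cannot keep spawning native threads.
    static final int MAX_THREADS = 4;

    static ThreadPoolExecutor newBoundedExecutor() {
        return new ThreadPoolExecutor(
            1, MAX_THREADS,
            60L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),              // bounded backlog of pending RPCs
            new ThreadPoolExecutor.CallerRunsPolicy()); // push back on the caller instead of growing
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newBoundedExecutor();
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> { });  // stand-in for a getJournaledEdits RPC
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // Even after 1000 submissions, at most MAX_THREADS worker threads ever existed.
        System.out.println("largest pool size: " + pool.getLargestPoolSize());
    }
}
```

With an unbounded pool (e.g. `Executors.newCachedThreadPool()`), the same loop could create one native thread per stalled RPC, which matches the crash in the log.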
[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973653#comment-16973653 ] Hadoop QA commented on HDFS-14924: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 37s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 622 unchanged - 0 fixed = 623 total (was 622) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 1m 13s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 36s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14924 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985768/HDFS-14924.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 08f5ffcb26dc 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / df6b316 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28308/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/28308/artifact/out/diff-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs.txt | | unit |
[jira] [Commented] (HDDS-2468) scmcli close pipeline command not working
[ https://issues.apache.org/jira/browse/HDDS-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973647#comment-16973647 ] Hanisha Koneru commented on HDDS-2468: -- Deactivate Pipeline is also failing with the same error. > scmcli close pipeline command not working > - > > Key: HDDS-2468 > URL: https://issues.apache.org/jira/browse/HDDS-2468 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Rajesh Balamohan >Assignee: Nanda kumar >Priority: Major > > Close pipeline command is failing with the following exception > {noformat} > java.lang.IllegalArgumentException: Unknown command type: ClosePipeline > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.processRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:219) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.submitRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:112) > at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:29883) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, 
e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
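The stack trace shows {{processRequest}} rejecting the call before any handler runs, i.e. the server-side dispatch simply has no case for the pipeline commands — which is also why Deactivate Pipeline fails identically. A toy model of that dispatch (the enum and return strings are hypothetical, not the actual protobuf types):

```java
public class DispatchSketch {
    enum CmdType { GetContainer, ClosePipeline, DeactivatePipeline }

    // Hypothetical mirror of processRequest(): any command type not wired
    // into the switch is rejected with the exact error seen above.
    static String dispatch(CmdType type) {
        switch (type) {
            case GetContainer:
                return "handled GetContainer";
            case ClosePipeline:          // the missing wiring behind HDDS-2468
                return "handled ClosePipeline";
            case DeactivatePipeline:     // fails the same way until wired in
                return "handled DeactivatePipeline";
            default:
                throw new IllegalArgumentException("Unknown command type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(CmdType.ClosePipeline));
    }
}
```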
[jira] [Commented] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973648#comment-16973648 ] Hudson commented on HDFS-14979: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17638 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17638/]) HDFS-14979 Allow Balancer to submit getBlocks calls to Observer Nodes (xkrogen: rev 586defe7113ed246ed0275bb3833882a3d873d70) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamenodeProtocol.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java > [Observer Node] Balancer should submit getBlocks to Observer Node when > possible > --- > > Key: HDFS-14979 > URL: https://issues.apache.org/jira/browse/HDFS-14979 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, hdfs >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14979.000.patch > > > In HDFS-14162, we made it so that the Balancer could function when > {{ObserverReadProxyProvider}} was in use. However, the Balancer would still > read from the active NameNode, because {{getBlocks}} wasn't annotated as > {{@ReadOnly}}. This task is to enable the Balancer to actually read from the > Observer Node to alleviate load from the active NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973625#comment-16973625 ] Konstantin Shvachko commented on HDFS-14979: +1 LGTM > [Observer Node] Balancer should submit getBlocks to Observer Node when > possible > --- > > Key: HDFS-14979 > URL: https://issues.apache.org/jira/browse/HDFS-14979 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, hdfs >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14979.000.patch > > > In HDFS-14162, we made it so that the Balancer could function when > {{ObserverReadProxyProvider}} was in use. However, the Balancer would still > read from the active NameNode, because {{getBlocks}} wasn't annotated as > {{@ReadOnly}}. This task is to enable the Balancer to actually read from the > Observer Node to alleviate load from the active NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
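The routing mechanism described above can be sketched with a stand-in annotation: the proxy provider only sends a call to an Observer when the protocol method is marked read-only, so leaving the annotation off {{getBlocks}} pinned the Balancer to the active NameNode. All names here are hypothetical stand-ins for Hadoop's real annotation and proxy classes:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class ReadOnlyRoutingSketch {
    // Stand-in for Hadoop's @ReadOnly marker annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ReadOnly { }

    interface NamenodeProtocol {
        @ReadOnly                 // HDFS-14979: without this, the call goes to the active
        String getBlocks();
        String rollEditLog();     // mutating call, must stay on the active NameNode
    }

    // Sketch of the routing decision an ObserverReadProxyProvider-style proxy makes.
    static String routeTo(String methodName) {
        try {
            Method m = NamenodeProtocol.class.getMethod(methodName);
            return m.isAnnotationPresent(ReadOnly.class) ? "observer" : "active";
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("getBlocks -> " + routeTo("getBlocks"));       // observer
        System.out.println("rollEditLog -> " + routeTo("rollEditLog"));   // active
    }
}
```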
[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2470: - Description: Right now when complete Multipart Upload is not printing partName and partNumber into the audit log. This will help in analyzing audit logs for MPU. 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=xx.xx.xx.xx | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "xx.xx.xx.xx" hostName: "xx.xx.xx.xx" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | was: Right now when complete Multipart Upload is not printing partName and partNumber into the audit log. This will help in analyzing audit logs for MPU. 
2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. 
> > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=xx.xx.xx.xx | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline
[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2470: - Status: Patch Available (was: Open) > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. > > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline { > leaderID: "" > members { > uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" > ipAddress: "9.134.51.232" > hostName: "9.134.51.232" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" > networkLocation: "/default-rack" > } > members { > uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" > ipAddress: "9.134.51.25" > hostName: "9.134.51.25" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" > networkLocation: "/default-rack" > } > members { > uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > ipAddress: "9.134.51.215" > hostName: "9.134.51.215" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: 
"79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > networkLocation: "/default-rack" > } > state: PIPELINE_OPEN > type: RATIS > factor: THREE > id > { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } > } > ]} | ret=SUCCESS | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2470: - Labels: pull-request-available (was: ) > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. > > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline { > leaderID: "" > members { > uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" > ipAddress: "9.134.51.232" > hostName: "9.134.51.232" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" > networkLocation: "/default-rack" > } > members { > uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" > ipAddress: "9.134.51.25" > hostName: "9.134.51.25" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" > networkLocation: "/default-rack" > } > members { > uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > ipAddress: "9.134.51.215" > hostName: "9.134.51.215" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > networkLocation: 
"/default-rack" > } > state: PIPELINE_OPEN > type: RATIS > factor: THREE > id > { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } > } > ]} | ret=SUCCESS | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?focusedWorklogId=342808=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342808 ] ASF GitHub Bot logged work on HDDS-2470: Author: ASF GitHub Bot Created on: 13/Nov/19 19:20 Start Date: 13/Nov/19 19:20 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #155: HDDS-2470. Add partName, partNumber for CommitMultipartUpload. URL: https://github.com/apache/hadoop-ozone/pull/155 ## What changes were proposed in this pull request? Add partName and partName to audit log when logging commitMultipartUpload request. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2470 ## How was this patch tested? Just a log change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342808) Remaining Estimate: 0h Time Spent: 10m > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. 
> > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline { > leaderID: "" > members { > uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" > ipAddress: "9.134.51.232" > hostName: "9.134.51.232" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" > networkLocation: "/default-rack" > } > members { > uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" > ipAddress: "9.134.51.25" > hostName: "9.134.51.25" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" > networkLocation: "/default-rack" > } > members { > uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > ipAddress: "9.134.51.215" > hostName: "9.134.51.215" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > networkLocation: "/default-rack" > } > state: PIPELINE_OPEN > type: RATIS > factor: THREE > id > { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } > } > ]} | ret=SUCCESS | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
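The change itself is small: the audit map built for COMMIT_MULTIPART_UPLOAD_PARTKEY just needs two more entries. A hedged sketch — the method and field names are illustrative, not the actual OzoneManager code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MpuAuditSketch {
    // Hypothetical stand-in for the audit map assembled when a multipart
    // upload part is committed; the fix carries partName/partNumber
    // alongside the existing volume/bucket/key fields.
    static Map<String, String> toAuditMap(String volume, String bucket, String key,
                                          String partName, int partNumber) {
        Map<String, String> audit = new LinkedHashMap<>();
        audit.put("volume", volume);
        audit.put("bucket", bucket);
        audit.put("key", key);
        audit.put("partName", partName);                     // new field (HDDS-2470)
        audit.put("partNumber", String.valueOf(partNumber)); // new field (HDDS-2470)
        return audit;
    }
}
```

With these two fields present, a log-analysis job can correlate each commit line back to the part that produced it.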
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973613#comment-16973613 ] Íñigo Goiri commented on HDFS-14983: It might be missing in the dfsrouteradmin though. You could use dfsadmin pointing to the router but it might be good to have it in dfsrouteradmin itself. > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
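For reference, the NameNode-side refresh is issued as below; pointing the same admin command at the Router's RPC address is the dfsadmin workaround mentioned above until dfsrouteradmin grows its own option. The router address is a placeholder, not a value from this thread:

```shell
# Refresh proxyuser (superuser group) config on the NameNode without a restart
hdfs dfsadmin -refreshSuperUserGroupsConfiguration

# Workaround: aim the same refresh at a DFSRouter's RPC endpoint
# (router-host:8888 is a placeholder for your Router's rpc-address)
hdfs dfsadmin -fs hdfs://router-host:8888 -refreshSuperUserGroupsConfiguration
```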
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973612#comment-16973612 ] Íñigo Goiri commented on HDFS-14983: Isn't this done by HDFS-14545? > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14955) RBF: getQuotaUsage() on mount point should return global quota.
[ https://issues.apache.org/jira/browse/HDFS-14955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973609#comment-16973609 ] Íñigo Goiri commented on HDFS-14955: The failed unit test is unrelated and will be fixed by HDFS-14974. +1 on [^HDFS-14955.002.patch]. [~ayushtkn] do you mind taking a look? > RBF: getQuotaUsage() on mount point should return global quota. > --- > > Key: HDFS-14955 > URL: https://issues.apache.org/jira/browse/HDFS-14955 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-14955.001.patch, HDFS-14955.002.patch > > > When getQuotaUsage() is called on a mount point path, the quota part should be the > global quota. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14974) RBF: Make tests use free ports
[ https://issues.apache.org/jira/browse/HDFS-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973608#comment-16973608 ] Íñigo Goiri commented on HDFS-14974: HDFS-14955 failed in TestRBFMetrics because of an issue that would be resolved with [^HDFS-14974.000.patch]. Any concerns with the current approach? > RBF: Make tests use free ports > -- > > Key: HDFS-14974 > URL: https://issues.apache.org/jira/browse/HDFS-14974 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-14974.000.patch > > > Currently, {{TestRouterSecurityManager#testCreateCredentials}} creates a > Router with the default ports. However, these ports might already be in use. We should > set it to :0 for it to be assigned dynamically. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
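The ":0" convention works because binding to port 0 asks the OS for any free port; the test then reads back whatever port was actually assigned. A minimal stdlib illustration of that mechanism:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class FreePortSketch {
    // Binding to port 0 delegates port selection to the OS, which is what the
    // RBF tests should rely on instead of hard-coding the Router defaults.
    static int bindEphemeral() {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort(); // the dynamically assigned port (> 0)
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("assigned port: " + bindEphemeral());
    }
}
```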
[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973606#comment-16973606 ] Íñigo Goiri commented on HDFS-14924: This seems familiar. Can you link the other JIRA? > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch > > > RenameSnapshot doesn't update the modification time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
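A minimal model of the reported bug and its fix (all names hypothetical, far simpler than the real snapshot manager): the rename bookkeeping itself succeeds, but nothing records a new modification time — that last line is the update the patch adds.

```java
import java.util.HashMap;
import java.util.Map;

public class RenameSnapshotSketch {
    // Toy stand-in for a snapshottable directory's state.
    final Map<String, String> snapshots = new HashMap<>();
    long modificationTime;

    void renameSnapshot(String oldName, String newName, long now) {
        String snapshot = snapshots.remove(oldName);
        snapshots.put(newName, snapshot);
        modificationTime = now; // HDFS-14924: this update was missing
    }
}
```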
[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2470: - Description: Right now when complete Multipart Upload is not printing partName and partNumber into the audit log. This will help in analyzing audit logs for MPU. 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | was: Right now when complete Multipart Upload is not printing partName and partNumber into the audit log. 
2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. 
> > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline { > leaderID: "" > members { > uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" > ipAddress: "9.134.51.232" >
[jira] [Created] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
Bharat Viswanadham created HDDS-2470: Summary: Add partName, partNumber for CommitMultipartUpload Key: HDDS-2470 URL: https://issues.apache.org/jira/browse/HDDS-2470 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham Assignee: Bharat Viswanadham Right now, Complete Multipart Upload does not print partName and partNumber in the audit log. 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2469: --- Status: Patch Available (was: Open) > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The Ozone RPC client should not modify the metadata map supplied by the caller while creating keys.
[jira] [Updated] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2469: - Labels: pull-request-available (was: ) > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > > The Ozone RPC client should not modify the metadata map supplied by the caller while creating keys.
[jira] [Work logged] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?focusedWorklogId=342778=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342778 ] ASF GitHub Bot logged work on HDDS-2469: Author: ASF GitHub Bot Created on: 13/Nov/19 18:44 Start Date: 13/Nov/19 18:44 Worklog Time Spent: 10m Work Description: adoroszlai commented on pull request #154: HDDS-2469. Avoid changing client-side key metadata URL: https://github.com/apache/hadoop-ozone/pull/154 ## What changes were proposed in this pull request? Let OM `RpcClient` add extra metadata keys only into the request to OM, not to the input from client. https://issues.apache.org/jira/browse/HDDS-2469 ## How was this patch tested? Tweaked integration tests: ``` Tests run: 69, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 26.154 s - in org.apache.hadoop.ozone.client.rpc.TestOzoneRpcClient Tests run: 70, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 27.976 s - in org.apache.hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.035 s - in org.apache.hadoop.ozone.client.rpc.TestOzoneRpcClientForAclAuditLog ``` Also ran `ozone` smoketest, which includes GDPR test: ``` ozone-gdpr :: Smoketest Ozone GDPR Feature == Test GDPR disabled| PASS | -- Test GDPR --enforcegdpr=true | PASS | -- Test GDPR -g=true | PASS | -- Test GDPR -g=false| PASS | -- ozone-gdpr :: Smoketest Ozone GDPR Feature| PASS | 4 critical tests, 4 passed, 0 failed 4 tests total, 4 passed, 0 failed ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342778) Remaining Estimate: 0h Time Spent: 10m > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The Ozone RPC client should not modify the metadata map supplied by the caller while creating keys.
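The fix described in the pull request above (add the extra metadata keys only to the request sent to OM, never to the caller's map) can be sketched with plain JDK collections. This is a minimal sketch of the idea, not the actual `RpcClient` code; the class, method, and key names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class MetadataCopySketch {
  // Illustrative stand-in for the client-side key-creation path: any
  // server-oriented entries (e.g. a GDPR secret) go into a defensive copy
  // that becomes the request, leaving the caller's map untouched.
  static Map<String, String> buildRequestMetadata(Map<String, String> clientMetadata) {
    Map<String, String> requestMetadata = new HashMap<>(clientMetadata); // defensive copy
    requestMetadata.put("gdprSecret", "generated-value"); // hypothetical extra key
    return requestMetadata;
  }
}
```

The observable contract is simply that the caller's map is unchanged after key creation, which is presumably what the tweaked integration tests above check.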
[jira] [Created] (HDDS-2469) Avoid changing client-side key metadata
Attila Doroszlai created HDDS-2469: -- Summary: Avoid changing client-side key metadata Key: HDDS-2469 URL: https://issues.apache.org/jira/browse/HDDS-2469 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Attila Doroszlai Assignee: Attila Doroszlai The Ozone RPC client should not modify the metadata map supplied by the caller while creating keys.
[jira] [Updated] (HDDS-2467) Allow running Freon validators with limited memory
[ https://issues.apache.org/jira/browse/HDDS-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2467: --- Status: Patch Available (was: In Progress) > Allow running Freon validators with limited memory > -- > > Key: HDDS-2467 > URL: https://issues.apache.org/jira/browse/HDDS-2467 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: freon >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Freon validators read each item to be validated completely into a {{byte[]}} > buffer. This allows timing only the read (and buffer allocation), but not > the subsequent digest calculation. However, it also means that memory > required for running the validators is proportional to key size. > I propose to add a command-line flag to allow calculating the digest while > reading the input stream. This changes timing results a bit, since values > will include the time required for digest calculation. On the other hand, > Freon will be able to validate huge keys with limited memory.
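The streaming-digest mode proposed above can be sketched with the JDK's `java.security.DigestInputStream`, which updates the digest as bytes pass through a small fixed buffer. This is a sketch of the technique only; the actual Freon flag name and validator wiring may differ:

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StreamingDigestSketch {
  // Digest while reading: memory use is a fixed 4 KB buffer regardless of
  // key size, at the cost of folding digest time into the measured read.
  static byte[] digestWhileReading(InputStream in)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("MD5");
    try (DigestInputStream dis = new DigestInputStream(in, md)) {
      byte[] buf = new byte[4096];
      while (dis.read(buf) != -1) {
        // bytes are discarded; the digest is updated as a side effect of read()
      }
    }
    return md.digest();
  }
}
```

The result is identical to digesting a fully-buffered `byte[]`; only the timing breakdown and peak memory change, which is exactly the trade-off the description calls out.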
[jira] [Work logged] (HDDS-2458) Avoid list copy in ChecksumData
[ https://issues.apache.org/jira/browse/HDDS-2458?focusedWorklogId=342730=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342730 ] ASF GitHub Bot logged work on HDDS-2458: Author: ASF GitHub Bot Created on: 13/Nov/19 17:19 Start Date: 13/Nov/19 17:19 Worklog Time Spent: 10m Work Description: nandakumar131 commented on pull request #141: HDDS-2458. Avoid list copy in ChecksumData URL: https://github.com/apache/hadoop-ozone/pull/141 Issue Time Tracking --- Worklog Id: (was: 342730) Time Spent: 20m (was: 10m) > Avoid list copy in ChecksumData > --- > > Key: HDDS-2458 > URL: https://issues.apache.org/jira/browse/HDDS-2458 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{ChecksumData}} is initially created with an empty list of checksums, then it > is updated with computed checksums, copying the list. The computed list can > be set directly.
[jira] [Updated] (HDDS-2458) Avoid list copy in ChecksumData
[ https://issues.apache.org/jira/browse/HDDS-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-2458: -- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Avoid list copy in ChecksumData > --- > > Key: HDDS-2458 > URL: https://issues.apache.org/jira/browse/HDDS-2458 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{ChecksumData}} is initially created with an empty list of checksums, then it > is updated with computed checksums, copying the list. The computed list can > be set directly.
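The change can be illustrated with a tiny stand-in class (a sketch of the idea, not the real `ChecksumData`): hand the computed list to the object once, instead of constructing with an empty list and copying every element into it afterwards:

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

public class ChecksumDataSketch {
  private final List<ByteBuffer> checksums;

  // Take ownership of the computed list directly; no empty-list
  // construction followed by an element-by-element copy.
  ChecksumDataSketch(List<ByteBuffer> computedChecksums) {
    this.checksums = computedChecksums;
  }

  List<ByteBuffer> getChecksums() {
    return Collections.unmodifiableList(checksums);
  }
}
```

Exposing the list through an unmodifiable view keeps the no-copy constructor safe for callers that should only read the checksums.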
[jira] [Commented] (HDFS-14969) Fix HDFS client unnecessary failover log printing
[ https://issues.apache.org/jira/browse/HDFS-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973543#comment-16973543 ] Erik Krogen commented on HDFS-14969: I agree that adjusting the logging depending on how many NNs are configured is the better approach. Your proposal seems reasonable to me. > Fix HDFS client unnecessary failover log printing > - > > Key: HDFS-14969 > URL: https://issues.apache.org/jira/browse/HDFS-14969 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.1.3 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > > In a multi-NameNode scenario, suppose there are 3 NNs and the 3rd is the ANN. > When a client starts an RPC against the 1st NN, it is silent when failing over > from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd > NN it prints some unnecessary logs; in some scenarios these logs can be > very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. Visit > https://s.apache.org/sbnn-error > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) > at > org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459) > ...{code}
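The approach being agreed to here, reduced to the level-selection decision, can be sketched as follows. This is a hypothetical shape only: the real client would thread the decision through `RetryInvocationHandler`, and the names, levels, and threshold are illustrative, not the actual patch:

```java
public class FailoverLogSketch {
  // Hypothetical policy: while there are still configured NameNodes left to
  // try, a standby response is expected, so log quietly; once every NN has
  // been tried, the situation is worth surfacing.
  static String levelFor(int failoverCount, int configuredNameNodes) {
    return failoverCount < configuredNameNodes - 1 ? "DEBUG" : "WARN";
  }
}
```

With 3 NNs configured, the first two failovers stay quiet and only a failover past the last configured NN produces a loud log line, which matches the behavior the description asks for.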
[jira] [Updated] (HDDS-2464) Avoid unnecessary allocations for FileChannel.open call
[ https://issues.apache.org/jira/browse/HDDS-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-2464: -- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Avoid unnecessary allocations for FileChannel.open call > --- > > Key: HDDS-2464 > URL: https://issues.apache.org/jira/browse/HDDS-2464 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{ChunkUtils}} calls {{FileChannel#open(Path, OpenOption...)}}. Vararg array > elements are then added to a new {{HashSet}} to call {{FileChannel#open(Path, > Set, FileAttribute...)}}. We can call the latter > directly instead.
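The JDK exposes the `Set`-based overload directly, so the shape of the fix can be sketched with a shared, immutable option set built once (the actual `ChunkUtils` change may differ in detail; the class and method names here are illustrative):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

public class OpenChannelSketch {
  // One immutable set reused for every call, instead of a fresh vararg
  // array plus a new HashSet on each FileChannel.open(Path, OpenOption...).
  private static final Set<StandardOpenOption> READ_OPTIONS =
      Collections.unmodifiableSet(EnumSet.of(StandardOpenOption.READ));

  static FileChannel openForRead(Path path) throws IOException {
    // Set-based overload: FileChannel.open(Path, Set, FileAttribute...)
    return FileChannel.open(path, READ_OPTIONS);
  }
}
```

Since datanodes open a channel per chunk read, hoisting the option set out of the call removes two small allocations from a hot path without changing behavior.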
[jira] [Work logged] (HDDS-2464) Avoid unnecessary allocations for FileChannel.open call
[ https://issues.apache.org/jira/browse/HDDS-2464?focusedWorklogId=342728=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342728 ] ASF GitHub Bot logged work on HDDS-2464: Author: ASF GitHub Bot Created on: 13/Nov/19 17:11 Start Date: 13/Nov/19 17:11 Worklog Time Spent: 10m Work Description: nandakumar131 commented on pull request #147: HDDS-2464. Avoid unnecessary allocations for FileChannel.open call URL: https://github.com/apache/hadoop-ozone/pull/147 Issue Time Tracking --- Worklog Id: (was: 342728) Time Spent: 20m (was: 10m) > Avoid unnecessary allocations for FileChannel.open call > --- > > Key: HDDS-2464 > URL: https://issues.apache.org/jira/browse/HDDS-2464 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {{ChunkUtils}} calls {{FileChannel#open(Path, OpenOption...)}}. Vararg array > elements are then added to a new {{HashSet}} to call {{FileChannel#open(Path, > Set, FileAttribute...)}}. We can call the latter > directly instead.
[jira] [Updated] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14924: - Attachment: HDFS-14924.001.patch > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch > > > RenameSnapshot does not update the modification time.
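The missing step can be shown with a tiny stand-in. This is a hypothetical shape of the fix, not the attached patch (which touches the actual snapshot-manager code): renaming a snapshot should also record a new modification time, as other mutations of a snapshottable directory do:

```java
public class RenameSnapshotSketch {
  // Hypothetical stand-in for a snapshottable directory's state.
  long modificationTime;

  // Renaming a snapshot should record the mutation on the directory;
  // setting the modification time is the step this issue reports missing.
  void renameSnapshot(String oldName, String newName, long now) {
    // ... rename bookkeeping elided ...
    modificationTime = now;
  }
}
```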