[jira] [Commented] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.

2019-09-30 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941542#comment-16941542
 ] 

Hadoop QA commented on HDFS-14885:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
36m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 |
| JIRA Issue | HDFS-14885 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12981848/HDFS-14885.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux 459dc065a713 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 137546a |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 342 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27990/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> UI: Fix a typo on WebUI of DataNode.
> 
>
> Key: HDFS-14885
> URL: https://issues.apache.org/jira/browse/HDFS-14885
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: HDFS-14885.patch, Screen Shot 2019-10-01 at 12.40.29.png
>
>
> A period ('.') should be added to the end of the following sentence on the 
> DataNode WebUI:
> "No nodes are decommissioning" 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14373) EC : Decoding is failing when block group last incomplete cell fall in to AlignedStripe

2019-09-30 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-14373:
--
Component/s: ec
   Priority: Critical  (was: Major)

> EC : Decoding is failing when block group last incomplete cell fall in to 
> AlignedStripe
> ---
>
> Key: HDFS-14373
> URL: https://issues.apache.org/jira/browse/HDFS-14373
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, hdfs-client
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Critical
> Attachments: HDFS-14373.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.

2019-09-30 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14885:
--
Attachment: HDFS-14885.patch
Status: Patch Available  (was: Open)

> UI: Fix a typo on WebUI of DataNode.
> 
>
> Key: HDFS-14885
> URL: https://issues.apache.org/jira/browse/HDFS-14885
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: HDFS-14885.patch, Screen Shot 2019-10-01 at 12.40.29.png
>
>
> A period ('.') should be added to the end of the following sentence on the 
> DataNode WebUI:
> "No nodes are decommissioning" 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2001) Update Ratis version to 0.4.0

2019-09-30 Thread Nanda kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941516#comment-16941516
 ] 

Nanda kumar commented on HDDS-2001:
---

The change in the ozone-0.4.1 branch was done as part of HDDS-2020.

> Update Ratis version to 0.4.0
> -
>
> Key: HDDS-2001
> URL: https://issues.apache.org/jira/browse/HDDS-2001
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Update Ratis version to 0.4.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2001) Update Ratis version to 0.4.0

2019-09-30 Thread Nanda kumar (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar resolved HDDS-2001.
---
Resolution: Fixed

> Update Ratis version to 0.4.0
> -
>
> Key: HDDS-2001
> URL: https://issues.apache.org/jira/browse/HDDS-2001
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Update Ratis version to 0.4.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.

2019-09-30 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14885:
--
Summary: UI: Fix a typo on WebUI of DataNode.  (was: UI: Fix a typo in )

> UI: Fix a typo on WebUI of DataNode.
> 
>
> Key: HDFS-14885
> URL: https://issues.apache.org/jira/browse/HDFS-14885
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: Screen Shot 2019-10-01 at 12.40.29.png
>
>
> A period ('.') should be added to the end of the following sentence on the 
> DataNode WebUI:
> "No nodes are decommissioning" 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14885) UI: Fix a typo in

2019-09-30 Thread Xieming Li (Jira)
Xieming Li created HDFS-14885:
-

 Summary: UI: Fix a typo in 
 Key: HDFS-14885
 URL: https://issues.apache.org/jira/browse/HDFS-14885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, ui
Reporter: Xieming Li
Assignee: Xieming Li
 Attachments: Screen Shot 2019-10-01 at 12.40.29.png

A period ('.') should be added to the end of the following sentence on the 
DataNode WebUI:

"No nodes are decommissioning" 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis

2019-09-30 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941498#comment-16941498
 ] 

Hadoop QA commented on HDDS-2169:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
31s{color} | {color:red} hadoop-hdds in trunk failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
34s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdds in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
13s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
18s{color} | {color:red} hadoop-hdds in trunk failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
16s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 17m 
38s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
31s{color} | {color:red} hadoop-hdds in trunk failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
17s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
33s{color} | {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
36s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
16s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 21s{color} 
| {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 17s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
16s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
28s{color} | {color:red} hadoop-hdds in the patch failed. {color} |
| 

[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=321044&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321044
 ]

ASF GitHub Bot logged work on HDDS-2169:


Author: ASF GitHub Bot
Created on: 01/Oct/19 03:31
Start Date: 01/Oct/19 03:31
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1517: HDDS-2169
URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536845581
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 96 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 20 | Maven dependency ordering for branch |
   | -1 | mvninstall | 31 | hadoop-hdds in trunk failed. |
   | -1 | mvninstall | 34 | hadoop-ozone in trunk failed. |
   | -1 | compile | 19 | hadoop-hdds in trunk failed. |
   | -1 | compile | 13 | hadoop-ozone in trunk failed. |
   | +1 | checkstyle | 59 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 971 | branch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 18 | hadoop-hdds in trunk failed. |
   | -1 | javadoc | 16 | hadoop-ozone in trunk failed. |
   | 0 | spotbugs | 1058 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | -1 | findbugs | 31 | hadoop-hdds in trunk failed. |
   | -1 | findbugs | 17 | hadoop-ozone in trunk failed. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 14 | Maven dependency ordering for patch |
   | -1 | mvninstall | 33 | hadoop-hdds in the patch failed. |
   | -1 | mvninstall | 36 | hadoop-ozone in the patch failed. |
   | -1 | compile | 21 | hadoop-hdds in the patch failed. |
   | -1 | compile | 16 | hadoop-ozone in the patch failed. |
   | -1 | javac | 21 | hadoop-hdds in the patch failed. |
   | -1 | javac | 17 | hadoop-ozone in the patch failed. |
   | +1 | checkstyle | 54 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 779 | patch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 19 | hadoop-hdds in the patch failed. |
   | -1 | javadoc | 16 | hadoop-ozone in the patch failed. |
   | -1 | findbugs | 28 | hadoop-hdds in the patch failed. |
   | -1 | findbugs | 18 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | -1 | unit | 24 | hadoop-hdds in the patch failed. |
   | -1 | unit | 23 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 29 | The patch does not generate ASF License warnings. |
   | | | 2550 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.0 Server=19.03.0 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1517 |
   | JIRA Issue | HDDS-2169 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux f83947a622e3 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / b3275ab |
   | Default Java | 1.8.0_222 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-mvninstall-hadoop-ozone.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-compile-hadoop-hdds.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-compile-hadoop-ozone.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-javadoc-hadoop-hdds.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-javadoc-hadoop-ozone.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-findbugs-hadoop-hdds.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-findbugs-hadoop-ozone.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/patch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/patch-mvninstall-hadoop-ozone.txt
 

[jira] [Work logged] (HDDS-1984) Fix listBucket API

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321037&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321037
 ]

ASF GitHub Bot logged work on HDDS-1984:


Author: ASF GitHub Bot
Created on: 01/Oct/19 02:56
Start Date: 01/Oct/19 02:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1555: HDDS-1984. Fix 
listBucket API.
URL: https://github.com/apache/hadoop/pull/1555#issuecomment-536837875
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 40 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 25 | Maven dependency ordering for branch |
   | -1 | mvninstall | 31 | hadoop-hdds in trunk failed. |
   | -1 | mvninstall | 36 | hadoop-ozone in trunk failed. |
   | -1 | compile | 20 | hadoop-hdds in trunk failed. |
   | -1 | compile | 15 | hadoop-ozone in trunk failed. |
   | +1 | checkstyle | 51 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 818 | branch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 25 | hadoop-hdds in trunk failed. |
   | -1 | javadoc | 16 | hadoop-ozone in trunk failed. |
   | 0 | spotbugs | 919 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | -1 | findbugs | 34 | hadoop-hdds in trunk failed. |
   | -1 | findbugs | 22 | hadoop-ozone in trunk failed. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 28 | Maven dependency ordering for patch |
   | -1 | mvninstall | 37 | hadoop-hdds in the patch failed. |
   | -1 | mvninstall | 39 | hadoop-ozone in the patch failed. |
   | -1 | compile | 25 | hadoop-hdds in the patch failed. |
   | -1 | compile | 16 | hadoop-ozone in the patch failed. |
   | -1 | javac | 25 | hadoop-hdds in the patch failed. |
   | -1 | javac | 16 | hadoop-ozone in the patch failed. |
   | -0 | checkstyle | 25 | hadoop-hdds: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 707 | patch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 21 | hadoop-hdds in the patch failed. |
   | -1 | javadoc | 20 | hadoop-ozone in the patch failed. |
   | -1 | findbugs | 32 | hadoop-hdds in the patch failed. |
   | -1 | findbugs | 20 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | -1 | unit | 30 | hadoop-hdds in the patch failed. |
   | -1 | unit | 28 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 34 | The patch does not generate ASF License warnings. |
   | | | 2360 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1555 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 883182c3dde4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / b3275ab |
   | Default Java | 1.8.0_222 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-mvninstall-hadoop-ozone.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-compile-hadoop-hdds.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-compile-hadoop-ozone.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-javadoc-hadoop-hdds.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-javadoc-hadoop-ozone.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-findbugs-hadoop-hdds.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-findbugs-hadoop-ozone.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/patch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 

[jira] [Work logged] (HDDS-1984) Fix listBucket API

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321036&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321036
 ]

ASF GitHub Bot logged work on HDDS-1984:


Author: ASF GitHub Bot
Created on: 01/Oct/19 02:56
Start Date: 01/Oct/19 02:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1555: HDDS-1984. Fix 
listBucket API.
URL: https://github.com/apache/hadoop/pull/1555#issuecomment-536837796
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 42 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 68 | Maven dependency ordering for branch |
   | -1 | mvninstall | 44 | hadoop-hdds in trunk failed. |
   | -1 | mvninstall | 39 | hadoop-ozone in trunk failed. |
   | -1 | compile | 21 | hadoop-hdds in trunk failed. |
   | -1 | compile | 14 | hadoop-ozone in trunk failed. |
   | +1 | checkstyle | 62 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 851 | branch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 21 | hadoop-hdds in trunk failed. |
   | -1 | javadoc | 18 | hadoop-ozone in trunk failed. |
   | 0 | spotbugs | 955 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | -1 | findbugs | 38 | hadoop-hdds in trunk failed. |
   | -1 | findbugs | 22 | hadoop-ozone in trunk failed. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 30 | Maven dependency ordering for patch |
   | -1 | mvninstall | 36 | hadoop-hdds in the patch failed. |
   | -1 | mvninstall | 40 | hadoop-ozone in the patch failed. |
   | -1 | compile | 26 | hadoop-hdds in the patch failed. |
   | -1 | compile | 18 | hadoop-ozone in the patch failed. |
   | -1 | javac | 26 | hadoop-hdds in the patch failed. |
   | -1 | javac | 18 | hadoop-ozone in the patch failed. |
   | -0 | checkstyle | 28 | hadoop-hdds: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 715 | patch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 24 | hadoop-hdds in the patch failed. |
   | -1 | javadoc | 18 | hadoop-ozone in the patch failed. |
   | -1 | findbugs | 32 | hadoop-hdds in the patch failed. |
   | -1 | findbugs | 20 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | -1 | unit | 28 | hadoop-hdds in the patch failed. |
   | -1 | unit | 29 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 35 | The patch does not generate ASF License warnings. |
   | | | 2467 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1555 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 08929fca86df 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / b3275ab |
   | Default Java | 1.8.0_222 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-mvninstall-hadoop-ozone.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-compile-hadoop-hdds.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-compile-hadoop-ozone.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-javadoc-hadoop-hdds.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-javadoc-hadoop-ozone.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-findbugs-hadoop-hdds.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-findbugs-hadoop-ozone.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/patch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 

[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=321035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321035
 ]

ASF GitHub Bot logged work on HDDS-2169:


Author: ASF GitHub Bot
Created on: 01/Oct/19 02:50
Start Date: 01/Oct/19 02:50
Worklog Time Spent: 10m 
  Work Description: szetszwo commented on issue #1517: HDDS-2169
URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536836755
 
 
   > Thanks @szetszwo for working on this. With the patch, while running the 
tests in TestDataValidateWithUnsafeByteOperations, the below issue is observed.
   > 
   > `2019-09-30 21:58:02,745 [grpc-default-executor-2] ERROR 
segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:(449)) - 
e4ab8454-30fe-420c-a1cf-40d223cb4898@group-D0335C23E8DA-SegmentedRaftLogWorker: 
writeStateMachineData failed for index 1, entry=(t:1, i:1), 
STATEMACHINELOGENTRY, client-6C45A0D09519, cid=8 
java.lang.IndexOutOfBoundsException: End index: 135008824 >= 207 at 
org.apache.ratis.thirdparty.com.google.protobuf.ByteString.checkRange(ByteString.java:1233)
 at 
org.apache.ratis.thirdparty.com.google.protobuf.ByteString$LiteralByteString.substring(ByteString.java:1288)
 at 
org.apache.hadoop.hdds.ratis.ContainerCommandRequestMessage.toProto(ContainerCommandRequestMessage.java:66)
 at 
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getContainerCommandRequestProto(ContainerStateMachine.java:375)
 at 
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.writeStateMachineData(ContainerStateMachine.java:494)
 at 
org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$WriteLog.(SegmentedRaftLogWorker.java:447)
 at 
org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.writeLogEntry(SegmentedRaftLogWorker.java:397)
 at 
org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendEntryImpl(SegmentedRaftLog.java:411)
 at 
org.apache.ratis.server.raftlog.RaftLog.lambda$appendEntry$10(RaftLog.java:359) 
at 
org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:77)
 at org.apache.ratis.server.raftlog.RaftLog.appendEntry(RaftLog.java:359) at 
org.apache.ratis.server.raftlog.RaftLog.appendImpl(RaftLog.java:183) at 
org.apache.ratis.server.raftlog.RaftLog.lambda$append$2(RaftLog.java:159) at 
org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:68)
 at org.apache.ratis.server.raftlog.RaftLog.append(RaftLog.java:159) at 
org.apache.ratis.server.impl.ServerState.appendLog(ServerState.java:282) at 
org.apache.ratis.server.impl.RaftServerImpl.appendTransaction(RaftServerImpl.java:505)
 at 
org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:576)
 at 
org.apache.ratis.server.impl.RaftServerProxy.lambda$submitClientRequestAsync$7(RaftServerProxy.java:333)
 at 
org.apache.ratis.server.impl.RaftServerProxy.lambda$null$5(RaftServerProxy.java:328)
 at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:109) at 
org.apache.ratis.server.impl.RaftServerProxy.lambda$submitRequest$6(RaftServerProxy.java:328)
 at 
java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:981)
 at 
java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2124) 
at 
org.apache.ratis.server.impl.RaftServerProxy.submitRequest(RaftServerProxy.java:327)
 at 
org.apache.ratis.server.impl.RaftServerProxy.submitClientRequestAsync(RaftServerProxy.java:333)
 at 
org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:220)
 at 
org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:326)
 at 
org.apache.ratis.util.SlidingWindow$Server.processRequestsFromHead(SlidingWindow.java:429)
 at 
org.apache.ratis.util.SlidingWindow$Server.receivedRequest(SlidingWindow.java:421)
 at 
org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:345)
 at 
org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:240)
 at 
org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:168)
 at 
org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:686)
 at 
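
   For context, a small self-contained sketch (an illustration only, using the
unshaded protobuf ByteString and the sizes from the log above) of the range
check that throws here: `substring()` fails whenever the requested end index
exceeds the actual payload size, e.g. when a length field is misread.

```java
import com.google.protobuf.ByteString;  // Ratis uses the shaded org.apache.ratis.thirdparty copy

public class ByteStringRangeDemo {
  public static void main(String[] args) {
    // 207 bytes of payload, as in the log above.
    ByteString payload = ByteString.copyFrom(new byte[207]);
    // substring() delegates to checkRange(), which throws:
    //   java.lang.IndexOutOfBoundsException: End index: 135008824 >= 207
    payload.substring(0, 135008824);
  }
}
```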

[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager

2019-09-30 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941476#comment-16941476
 ] 

Anu Engineer commented on HDDS-2175:


It is something that I disagree with, but if you feel strongly about this, 
please go ahead.

> Propagate System Exceptions from the OzoneManager
> -
>
> Key: HDDS-2175
> URL: https://issues.apache.org/jira/browse/HDDS-2175
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Exceptions encountered while processing requests on the OM are categorized as 
> business exceptions and system exceptions. All of the business exceptions are 
> captured as OMException and have an associated status code which is returned 
> to the client. The handling of these is not going to be changed.
> Currently, system exceptions are returned as INTERNAL ERROR to the client with 
> a one-line message string from the exception. The scope of this jira is to 
> capture system exceptions and propagate the related information (including the 
> complete stack trace) back to the client.
> There are 3 sub-tasks required to achieve this:
> 1. Separate capture and handling for OMException and the other 
> exceptions (IOException). For system exceptions, use the Hadoop IPC 
> ServiceException mechanism to send the stack trace to the client.
> 2. Track exceptions inside the Ratis OzoneManagerStateMachine and propagate 
> them up to the OzoneManager layer (on the leader). Currently, these 
> exceptions are not being tracked.
> 3. Handle and propagate exceptions from Ratis.
> Will raise a jira for each sub-task.
>   
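
For illustration, a minimal sketch of sub-task 1 (a sketch only: everything 
except ServiceException, RemoteException, and ProtobufHelper is a hypothetical 
stand-in, not the actual OM code):
{code:java}
import java.io.IOException;

import com.google.protobuf.ServiceException;
import org.apache.hadoop.ipc.ProtobufHelper;
import org.apache.hadoop.ipc.RemoteException;

public class ExceptionPropagationSketch {
  // Server side: a system exception (plain IOException) is wrapped in
  // ServiceException so Hadoop IPC carries its class, message, and full stack
  // trace to the client; a business exception (OMException) would instead be
  // mapped to a status code in the response, which this jira does not change.
  static void serverSide() throws ServiceException {
    try {
      throw new IOException("simulated system exception");
    } catch (IOException e) {
      throw new ServiceException(e);
    }
  }

  // Client side: unwrap the ServiceException back to an IOException; over a
  // real IPC connection the cause arrives as a RemoteException that can be
  // unwrapped further to the original exception class.
  public static void main(String[] args) {
    try {
      serverSide();
    } catch (ServiceException se) {
      IOException ioe = ProtobufHelper.getRemoteException(se);
      System.out.println("client sees: " + ioe);
      if (ioe instanceof RemoteException) {
        System.out.println("unwraps to: "
            + ((RemoteException) ioe).unwrapRemoteException());
      }
    }
  }
}
{code}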



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1984) Fix listBucket API

2019-09-30 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941471#comment-16941471
 ] 

Bharat Viswanadham commented on HDDS-1984:
--

Just posted an initial PR; I'm still thinking about how I can improve it. (With 
the posted patch, every call iterates the entire map.) It's a starter patch, 
and I'll need to look further into how it can be improved.

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and a double-buffer thread later picks it up and 
> flushes it to disk. So listBuckets should now use both the in-memory cache 
> and the RocksDB bucket table to list the buckets in a volume.
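
For illustration, a minimal sketch of the cache-plus-table merge the 
description calls for (a sketch under assumed types only; the real OM cache 
and table APIs differ):
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ListBucketsSketch {
  // Merge the in-memory cache (entries not yet flushed to disk by the
  // double-buffer thread) with the on-disk bucket table; cache entries win
  // on duplicate keys because they are newer.
  static List<String> listBuckets(TreeMap<String, Object> cache,
      TreeMap<String, Object> dbTable, String startAfter, int maxResults) {
    TreeMap<String, Object> merged = new TreeMap<>(dbTable);
    merged.putAll(cache);  // newer cache entries override the db view
    List<String> result = new ArrayList<>();
    for (String bucket : merged.tailMap(startAfter, false).keySet()) {
      if (result.size() >= maxResults) {
        break;
      }
      result.add(bucket);
    }
    return result;
  }
}
{code}
Note that copying both maps on every call is exactly the full-iteration cost 
mentioned in the comment above; a lazier merge of two sorted iterators would 
avoid it.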



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-1984) Fix listBucket API

2019-09-30 Thread YiSheng Lien (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YiSheng Lien reassigned HDDS-1984:
--

Assignee: Bharat Viswanadham

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and a double-buffer thread later picks it up and 
> flushes it to disk. So listBuckets should now use both the in-memory cache 
> and the RocksDB bucket table to list the buckets in a volume.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1984) Fix listBucket API

2019-09-30 Thread YiSheng Lien (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941469#comment-16941469
 ] 

YiSheng Lien commented on HDDS-1984:


Hello [~bharat],

Thanks for the comment. Never mind about it; I can learn more from your PR :)

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and a double-buffer thread later picks it up and 
> flushes it to disk. So listBuckets should now use both the in-memory cache 
> and the RocksDB bucket table to list the buckets in a volume.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-1984) Fix listBucket API

2019-09-30 Thread YiSheng Lien (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YiSheng Lien reassigned HDDS-1984:
--

Assignee: (was: YiSheng Lien)

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and a double-buffer thread later picks it up and 
> flushes it to disk. So listBuckets should now use both the in-memory cache 
> and the RocksDB bucket table to list the buckets in a volume.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1984) Fix listBucket API

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321027
 ]

ASF GitHub Bot logged work on HDDS-1984:


Author: ASF GitHub Bot
Created on: 01/Oct/19 02:14
Start Date: 01/Oct/19 02:14
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1555: 
HDDS-1984. Fix listBucket API.
URL: https://github.com/apache/hadoop/pull/1555
 
 
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 321027)
Remaining Estimate: 0h
Time Spent: 10m

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and a double-buffer thread later picks it up and 
> flushes it to disk. So listBuckets should now use both the in-memory cache 
> and the RocksDB bucket table to list the buckets in a volume.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1984) Fix listBucket API

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1984:
-
Labels: pull-request-available  (was: )

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: pull-request-available
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and a double-buffer thread later picks it up and 
> flushes it to disk. So listBuckets should now use both the in-memory cache 
> and the RocksDB bucket table to list the buckets in a volume.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1984) Fix listBucket API

2019-09-30 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941466#comment-16941466
 ] 

Bharat Viswanadham commented on HDDS-1984:
--

Hi [~cxorm],

Thanks for taking this up. I missed that you had assigned this Jira to 
yourself. I have already started working on this and have a patch ready; 
sorry about that. You could take up the other list APIs, which are similar 
to this Jira.

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and a double-buffer thread later picks it up and 
> flushes it to disk. So listBuckets should now use both the in-memory cache 
> and the RocksDB bucket table to list the buckets in a volume.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager

2019-09-30 Thread Arpit Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941456#comment-16941456
 ] 

Arpit Agarwal commented on HDDS-2175:
-

bq. it is hard to parse these exceptions even when they are part of normal log 
files.
And yet these exceptions are a godsend. I would rather see one exception than 
10 obscure log messages since it tells me exactly when something 'exceptional' 
happened and the code path leading to the occurrence.

bq. If we add exceptions to those strings, the human readability of those error 
messages goes down.
The readability goes up. You now actually get a sense of what went wrong 
instead of some generic message. 

bq. I had a chat with Supratim Deka and I said that I am all for increasing the 
fidelity of the error codes, that is we can add more error codes if we want to 
fine tune these messages. 
A lot more work, with inferior results. Error codes are terrible in layered 
systems [since multiple layers will often wind up translating 
codes|https://twitter.com/Obdurodon/status/1161700056740876289]. The only way 
to maintain full fidelity is to add a new error code for every single failure 
path, an impossible task. Instead, just present the original exception as it 
happened. This is friendlier for your end users and painless for developers.

bq. I prefer a clear, simple contract between the server and client, I think it 
makes it easier for future clients to be developed more easily.
Exceptions as added here will make development of future clients super easy. 
Since the exception is stringified and propagated over the wire, all the client 
has to do is print the string without any interpretation. The fear seems 
unfounded to me.

> Propagate System Exceptions from the OzoneManager
> -
>
> Key: HDDS-2175
> URL: https://issues.apache.org/jira/browse/HDDS-2175
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Exceptions encountered while processing requests on the OM are categorized as 
> business exceptions and system exceptions. All of the business exceptions are 
> captured as OMException and have an associated status code which is returned 
> to the client. The handling of these is not going to be changed.
> Currently, system exceptions are returned as INTERNAL ERROR to the client with 
> a one-line message string from the exception. The scope of this jira is to 
> capture system exceptions and propagate the related information (including the 
> complete stack trace) back to the client.
> There are 3 sub-tasks required to achieve this:
> 1. Separate capture and handling for OMException and the other 
> exceptions (IOException). For system exceptions, use the Hadoop IPC 
> ServiceException mechanism to send the stack trace to the client.
> 2. Track exceptions inside the Ratis OzoneManagerStateMachine and propagate 
> them up to the OzoneManager layer (on the leader). Currently, these 
> exceptions are not being tracked.
> 3. Handle and propagate exceptions from Ratis.
> Will raise a jira for each sub-task.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14814) RBF: RouterQuotaUpdateService supports inherited rule.

2019-09-30 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941451#comment-16941451
 ] 

Jinglun commented on HDFS-14814:


Thanks [~elgoiri] for your fast reply! I agree with your comments, especially 
the first one about setQuota(); it's very reasonable. Only one question:
{quote}I think that in the loop in getGlobalQuota, you could just do the ifs, 
and not do the if with the break, you will get the same number of comparissons.
{quote}
Do you mean the code below?
{code:java}
Entry<String, QuotaUsage> entry = pts.lastEntry();
while (entry != null) {
  String ppath = entry.getKey();
  QuotaUsage quota = entry.getValue();
  if (nQuota == HdfsConstants.QUOTA_RESET) {
    nQuota = quota.getQuota();
  }
  if (sQuota == HdfsConstants.QUOTA_RESET) {
    sQuota = quota.getSpaceQuota();
  }
  entry = pts.lowerEntry(ppath);
}{code}
As I understand it, if I don't break I'll visit all the entries even after I 
already have values for nQuota and sQuota. So I want to break early to save 
some pts.lowerEntry(ppath) calls; see the sketch below. Correct me if I'm 
wrong. Thanks!
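
For illustration, a sketch of the early-exit variant I mean (variable names 
and types assumed from the snippet above, not from the actual patch):
{code:java}
// Walk ancestors from the longest matching path downward; stop as soon as
// both quotas are resolved to skip the remaining lowerEntry() lookups.
Entry<String, QuotaUsage> entry = pts.lastEntry();
while (entry != null) {
  QuotaUsage quota = entry.getValue();
  if (nQuota == HdfsConstants.QUOTA_RESET) {
    nQuota = quota.getQuota();
  }
  if (sQuota == HdfsConstants.QUOTA_RESET) {
    sQuota = quota.getSpaceQuota();
  }
  if (nQuota != HdfsConstants.QUOTA_RESET
      && sQuota != HdfsConstants.QUOTA_RESET) {
    break;  // both quotas found; no need to visit further ancestors
  }
  entry = pts.lowerEntry(entry.getKey());
}{code}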

> RBF: RouterQuotaUpdateService supports inherited rule.
> --
>
> Key: HDFS-14814
> URL: https://issues.apache.org/jira/browse/HDFS-14814
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-14814.001.patch, HDFS-14814.002.patch, 
> HDFS-14814.003.patch, HDFS-14814.004.patch, HDFS-14814.005.patch, 
> HDFS-14814.006.patch, HDFS-14814.007.patch, HDFS-14814.008.patch, 
> HDFS-14814.009.patch, HDFS-14814.010.patch
>
>
> I want to add a rule, *'The quota should be set the same as the nearest 
> parent'*, to Global Quota. Suppose we have the mount table below.
> M1: /dir-a                            ns0->/dir-a     \{nquota=10,squota=20}
> M2: /dir-a/dir-b                 ns1->/dir-b     \{nquota=-1,squota=30}
> M3: /dir-a/dir-b/dir-c       ns2->/dir-c     \{nquota=-1,squota=-1}
> M4: /dir-d                           ns3->/dir-d     \{nquota=-1,squota=-1}
>  
> The quota for the remote locations on the namespaces should be:
>  ns0->/dir-a     \{nquota=10,squota=20}
>  ns1->/dir-b     \{nquota=10,squota=30}
>  ns2->/dir-c      \{nquota=10,squota=30}
>  ns3->/dir-d     \{nquota=-1,squota=-1}
>  
> The quota of each remote location is set the same as its corresponding 
> MountTable, and if the MountTable has no quota then the quota is set 
> to that of the nearest parent MountTable with a quota.
>  
> It's easy to implement. In RouterQuotaUpdateService, each time we compute 
> the currentQuotaUsage we can get the quota info for each MountTable. We can 
> then check and fix every MountTable whose quota doesn't match the rule above.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-14305:
---
Fix Version/s: (was: 3.1.3)
   (was: 3.2.1)
   (was: 3.0.4)
   3.2.2
   3.1.4
   2.10.0
 Hadoop Flags: Reviewed
 Assignee: Konstantin Shvachko  (was: Xiaoqiao He)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

[~vagarychen] you are absolutely correct, thanks for the review.

I just committed this to trunk, branch-3.2, branch-3.1, and branch-2. Updated 
fix versions.

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Konstantin Shvachko
>Priority: Major
>  Labels: multi-sbnn, release-blocker
> Fix For: 2.10.0, 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, 
> HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, 
> HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNodes could have overlapping 
> serial-number ranges. For simplicity, let's assume {{Integer.MAX_VALUE}} is 
> 100, and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then 
> the ranges for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When a collision happens, DataNodes could overwrite an existing key, which 
> will cause clients to fail with an {{InvalidToken}} error.
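
For illustration, a standalone sketch of the overlap (scaling 
{{Integer.MAX_VALUE}} down to 100 as in the example above; not the actual 
Hadoop code):
{code:java}
public class SerialRangeDemo {
  public static void main(String[] args) {
    final int MAX = 100;   // stand-in for Integer.MAX_VALUE
    final int numNNs = 2;
    for (int nnIndex = 0; nnIndex < numNNs; nnIndex++) {
      int intRange = MAX / numNNs;           // 50
      int nnRangeStart = intRange * nnIndex;
      // serialNo % intRange lies in (-intRange, intRange) when serialNo can
      // be negative, so the reachable range for each NameNode is:
      int lo = -(intRange - 1) + nnRangeStart;
      int hi = (intRange - 1) + nnRangeStart;
      System.out.println("nn" + (nnIndex + 1) + " -> [" + lo + ", " + hi + "]");
    }
    // Prints nn1 -> [-49, 49] and nn2 -> [1, 99]: the ranges overlap.
  }
}
{code}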



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320994&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320994
 ]

ASF GitHub Bot logged work on HDDS-1720:


Author: ASF GitHub Bot
Created on: 01/Oct/19 00:22
Start Date: 01/Oct/19 00:22
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add 
ability to configure RocksDB logs for Ozone Manager.
URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536804802
 
 
   Can you please add a test case that proves that RocksDB actually produces 
logs that we can see? There is a log-listener class in Hadoop that you can 
use, or create a test and then grep for some of the log statements. Otherwise 
the change looks quite good to me.
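
   For example, a minimal sketch of such a test (the logger name and the 
asserted log text are assumptions, not the actual patch), using Hadoop's 
GenericTestUtils.LogCapturer:

```java
import org.apache.hadoop.test.GenericTestUtils;
import org.junit.Assert;
import org.junit.Test;
import org.slf4j.LoggerFactory;

public class TestOmRocksDbLogging {
  @Test
  public void testRocksDbEmitsLogs() throws Exception {
    // "rocksdb" is an assumed logger name; use whatever logger the OM
    // RocksDB log adapter writes to.
    GenericTestUtils.LogCapturer logs = GenericTestUtils.LogCapturer
        .captureLogs(LoggerFactory.getLogger("rocksdb"));
    try {
      // ... open the OM RocksDB store with logging enabled and do a few
      // writes here ...
      Assert.assertTrue("expected some RocksDB log output",
          logs.getOutput().contains("RocksDB"));  // assumed log text
    } finally {
      logs.stopCapturing();
    }
  }
}
```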
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320994)
Time Spent: 1h 10m  (was: 1h)

> Add ability to configure RocksDB logs for Ozone Manager
> ---
>
> Key: HDDS-1720
> URL: https://issues.apache.org/jira/browse/HDDS-1720
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> While doing performance testing, it was seen that there was no way to get 
> RocksDB logs for Ozone Manager. Along with RocksDB metrics, this may be a 
> useful mechanism for understanding the health of RocksDB while investigating 
> large clusters. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320993=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320993
 ]

ASF GitHub Bot logged work on HDDS-1720:


Author: ASF GitHub Bot
Created on: 01/Oct/19 00:21
Start Date: 01/Oct/19 00:21
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add 
ability to configure RocksDB logs for Ozone Manager.
URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536804802
 
 
   Can you please add a test case that proves that RocksDB actually produces 
logs that we can see? There is a log listener class in Hadoop; you can use 
it, or create a test and then grep for some of the log statements.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320993)
Time Spent: 1h  (was: 50m)

> Add ability to configure RocksDB logs for Ozone Manager
> ---
>
> Key: HDDS-1720
> URL: https://issues.apache.org/jira/browse/HDDS-1720
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While doing performance testing, it was seen that there was no way to get 
> RocksDB logs for Ozone Manager. Along with RocksDB metrics, this may be a 
> useful mechanism for understanding the health of RocksDB while investigating 
> large clusters. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941420#comment-16941420
 ] 

Hudson commented on HDFS-14305:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17421 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17421/])
HDFS-14305. Fix serial number calculation in BlockTokenSecretManager to (shv: 
rev b3275ab1f2f4546ba4bdc0e48cfa60b5b05071b9)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailoverWithBlockTokensEnabled.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSecretManager.java


> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: multi-sbnn, release-blocker
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, 
> HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, 
> HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNodes could have overlapping serial 
> number ranges. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the 
> ranges for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key, which 
> will cause clients to fail with an {{InvalidToken}} error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon

2019-09-30 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941411#comment-16941411
 ] 

Wei-Chiu Chuang commented on HDFS-14235:


Commit applies cleanly to branch-3.1. Updated the fix version.

> Handle ArrayIndexOutOfBoundsException in 
> DataNodeDiskMetrics#slowDiskDetectionDaemon 
> -
>
> Key: HDFS-14235
> URL: https://issues.apache.org/jira/browse/HDFS-14235
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Surendra Singh Lilhore
>Assignee: Ranith Sardar
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.4
>
> Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, 
> HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png
>
>
> The code below throws an exception because {{volumeIterator.next()}} is called 
> twice without checking {{hasNext()}}.
> {code:java}
> while (volumeIterator.hasNext()) {
>   FsVolumeSpi volume = volumeIterator.next();
>   DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics();
>   String volumeName = volume.getBaseURI().getPath();
>   metadataOpStats.put(volumeName,
>   metrics.getMetadataOperationMean());
>   readIoStats.put(volumeName, metrics.getReadIoMean());
>   writeIoStats.put(volumeName, metrics.getWriteIoMean());
> }{code}
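
A minimal sketch of the fix, keeping the surrounding names from the snippet 
above: advance the iterator once per pass and reuse the fetched element.

{code:java}
while (volumeIterator.hasNext()) {
  FsVolumeSpi volume = volumeIterator.next();
  // Reuse the element already fetched instead of advancing the iterator again.
  DataNodeVolumeMetrics metrics = volume.getMetrics();
  String volumeName = volume.getBaseURI().getPath();
  metadataOpStats.put(volumeName, metrics.getMetadataOperationMean());
  readIoStats.put(volumeName, metrics.getReadIoMean());
  writeIoStats.put(volumeName, metrics.getWriteIoMean());
}
{code}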



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14235:
---
Fix Version/s: 3.1.4

> Handle ArrayIndexOutOfBoundsException in 
> DataNodeDiskMetrics#slowDiskDetectionDaemon 
> -
>
> Key: HDFS-14235
> URL: https://issues.apache.org/jira/browse/HDFS-14235
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Surendra Singh Lilhore
>Assignee: Ranith Sardar
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.4
>
> Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, 
> HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png
>
>
> The code below throws an exception because {{volumeIterator.next()}} is called 
> twice without checking {{hasNext()}}.
> {code:java}
> while (volumeIterator.hasNext()) {
>   FsVolumeSpi volume = volumeIterator.next();
>   DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics();
>   String volumeName = volume.getBaseURI().getPath();
>   metadataOpStats.put(volumeName,
>   metrics.getMetadataOperationMean());
>   readIoStats.put(volumeName, metrics.getReadIoMean());
>   writeIoStats.put(volumeName, metrics.getWriteIoMean());
> }{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2205) checkstyle.sh reports wrong failure count

2019-09-30 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941409#comment-16941409
 ] 

Hudson commented on HDDS-2205:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17420 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17420/])
HDDS-2205. checkstyle.sh reports wrong failure count (aengineer: rev 
e5bba592a84a94e0545479b668e6925eb4b8858c)
* (edit) hadoop-ozone/dev-support/checks/checkstyle.sh


> checkstyle.sh reports wrong failure count
> -
>
> Key: HDDS-2205
> URL: https://issues.apache.org/jira/browse/HDDS-2205
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{checkstyle.sh}} outputs files with checkstyle violations and the violations 
> themselves on separate lines. It then reports the total line count as the 
> number of failures.
> {code:title=target/checkstyle/summary.txt}
> hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java
>  49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager.
> {code}
> {code:title=target/checkstyle/failures}
> 2
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1615) ManagedChannel references are being leaked in ReplicationSupervisor.java

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1615?focusedWorklogId=320976=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320976
 ]

ASF GitHub Bot logged work on HDDS-1615:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:40
Start Date: 30/Sep/19 23:40
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1547: HDDS-1615. 
ManagedChannel references are being leaked in ReplicationS…
URL: https://github.com/apache/hadoop/pull/1547#issuecomment-536795997
 
 
   +1. LGTM.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320976)
Time Spent: 0.5h  (was: 20m)

> ManagedChannel references are being leaked in ReplicationSupervisor.java
> 
>
> Key: HDDS-1615
> URL: https://issues.apache.org/jira/browse/HDDS-1615
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> ManagedChannel references are being leaked in ReplicationSupervisor.java
> {code}
> May 30, 2019 8:10:56 AM 
> org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference
>  cleanQueue
> SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=1495, 
> target=192.168.0.3:49868} was not shutdown properly!!! ~*~*~*
> Make sure to call shutdown()/shutdownNow() and wait until 
> awaitTermination() returns true.
> java.lang.RuntimeException: ManagedChannel allocation site
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:411)
> at 
> org.apache.hadoop.ozone.container.replication.GrpcReplicationClient.(GrpcReplicationClient.java:65)
> at 
> org.apache.hadoop.ozone.container.replication.SimpleContainerDownloader.getContainerDataFromReplicas(SimpleContainerDownloader.java:87)
> at 
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:118)
> at 
> org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:115)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
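
The warning text itself points at the usual remedy: shut each channel down once 
the replication work completes. A minimal sketch against the plain gRPC 
{{ManagedChannel}} API (illustrative only, not the actual Ozone code):

{code:java}
import java.util.concurrent.TimeUnit;

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class ChannelShutdownSketch {
  public static void main(String[] args) throws InterruptedException {
    ManagedChannel channel = ManagedChannelBuilder
        .forAddress("192.168.0.3", 49868)
        .usePlaintext()
        .build();
    try {
      // ... issue the replication RPCs on this channel ...
    } finally {
      // Release the channel so the orphan-wrapper cleaner has nothing to report.
      channel.shutdown();
      if (!channel.awaitTermination(5, TimeUnit.SECONDS)) {
        channel.shutdownNow();
      }
    }
  }
}
{code}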



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320977=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320977
 ]

ASF GitHub Bot logged work on HDDS-1720:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:40
Start Date: 30/Sep/19 23:40
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add 
ability to configure RocksDB logs for Ozone Manager.
URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536796152
 
 
   /retest
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320977)
Time Spent: 50m  (was: 40m)

> Add ability to configure RocksDB logs for Ozone Manager
> ---
>
> Key: HDDS-1720
> URL: https://issues.apache.org/jira/browse/HDDS-1720
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> While doing performance testing, it was seen that there was no way to get 
> RocksDB logs for Ozone Manager. Along with Rocksdb metrics, this may be a 
> useful mechanism to understand the health of Rocksdb while investigating 
> large clusters. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2205) checkstyle.sh reports wrong failure count

2019-09-30 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-2205:
---
Fix Version/s: 0.5.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> checkstyle.sh reports wrong failure count
> -
>
> Key: HDDS-2205
> URL: https://issues.apache.org/jira/browse/HDDS-2205
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{checkstyle.sh}} outputs files with checkstyle violations and the violations 
> themselves on separate lines. It then reports the total line count as the 
> number of failures.
> {code:title=target/checkstyle/summary.txt}
> hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java
>  49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager.
> {code}
> {code:title=target/checkstyle/failures}
> 2
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2205) checkstyle.sh reports wrong failure count

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2205?focusedWorklogId=320975=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320975
 ]

ASF GitHub Bot logged work on HDDS-2205:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:37
Start Date: 30/Sep/19 23:37
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1548: HDDS-2205. 
checkstyle.sh reports wrong failure count
URL: https://github.com/apache/hadoop/pull/1548
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320975)
Time Spent: 1h  (was: 50m)

> checkstyle.sh reports wrong failure count
> -
>
> Key: HDDS-2205
> URL: https://issues.apache.org/jira/browse/HDDS-2205
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{checkstyle.sh}} outputs files with checkstyle violations and the violations 
> themselves on separate lines. It then reports the total line count as the 
> number of failures.
> {code:title=target/checkstyle/summary.txt}
> hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java
>  49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager.
> {code}
> {code:title=target/checkstyle/failures}
> 2
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2205) checkstyle.sh reports wrong failure count

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2205?focusedWorklogId=320974=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320974
 ]

ASF GitHub Bot logged work on HDDS-2205:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:37
Start Date: 30/Sep/19 23:37
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1548: HDDS-2205. 
checkstyle.sh reports wrong failure count
URL: https://github.com/apache/hadoop/pull/1548#issuecomment-536795581
 
 
   Thank you for the contribution. I have committed this patch to the trunk. 
@dineshchitlangia  Thank you for the review.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320974)
Time Spent: 50m  (was: 40m)

> checkstyle.sh reports wrong failure count
> -
>
> Key: HDDS-2205
> URL: https://issues.apache.org/jira/browse/HDDS-2205
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{checkstyle.sh}} outputs files with checkstyle violations and the violations 
> themselves on separate lines. It then reports the total line count as the 
> number of failures.
> {code:title=target/checkstyle/summary.txt}
> hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java
>  49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager.
> {code}
> {code:title=target/checkstyle/failures}
> 2
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2203) Race condition in ByteStringHelper.init()

2019-09-30 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941398#comment-16941398
 ] 

Anu Engineer commented on HDDS-2203:


Makes sense. Do you want this patch committed, or should we just move to the new model? 


> Race condition in ByteStringHelper.init()
> -
>
> Key: HDDS-2203
> URL: https://issues.apache.org/jira/browse/HDDS-2203
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client, SCM
>Reporter: Istvan Fajth
>Assignee: Istvan Fajth
>Priority: Critical
>  Labels: pull-request-available, pull-requests-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The current init method:
> {code}
> public static void init(boolean isUnsafeByteOperation) {
>   final boolean set = INITIALIZED.compareAndSet(false, true);
>   if (set) {
>     ByteStringHelper.isUnsafeByteOperationsEnabled =
>         isUnsafeByteOperation;
>   } else {
>     // already initialized, check values
>     Preconditions.checkState(isUnsafeByteOperationsEnabled
>         == isUnsafeByteOperation);
>   }
> }
> {code}
> In a scenario where two threads access this method and the execution order is 
> the following, the second thread runs into an exception from 
> {{Preconditions.checkState()}} in the else branch.
> Starting from an uninitialized state (the field 
> {{isUnsafeByteOperationsEnabled}} defaults to false):
> - T1 arrives at the method with true as the parameter
> - T1 sets INITIALIZED to true
> - T2 arrives at the method with true as the parameter
> - T2 reads the INITIALIZED value and, as it is not false, goes to the else 
> branch
> - T2 checks whether the internal boolean property equals the true it wanted to 
> set, and as T1 has not yet set the value, the checkState throws an 
> IllegalStateException.
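
One way to close the window is to make the write of the flag and the flip of 
INITIALIZED atomic with respect to each other, for example by synchronizing the 
method. A minimal sketch reusing the names above (not the actual patch):

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

import com.google.common.base.Preconditions;

public final class ByteStringHelperSketch {
  private static final AtomicBoolean INITIALIZED = new AtomicBoolean();
  private static volatile boolean isUnsafeByteOperationsEnabled;

  // With the method synchronized, a second caller can never observe
  // INITIALIZED == true while the flag still holds its default value.
  public static synchronized void init(boolean isUnsafeByteOperation) {
    if (INITIALIZED.compareAndSet(false, true)) {
      isUnsafeByteOperationsEnabled = isUnsafeByteOperation;
    } else {
      Preconditions.checkState(
          isUnsafeByteOperationsEnabled == isUnsafeByteOperation);
    }
  }
}
{code}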
> This happens in certain Hive query cases, as it came from that testing, the 
> exception we see there is the following:
> {code}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, 
> vertexName=Map 2, vertexId=vertex_1569486223160_0334_1_02, 
> diagnostics=[Vertex vertex_1569486223160_0334_1_02 [Map 2] killed/failed
>  due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed, 
> vertex=vertex_1569486223160_0334_1_02 [Map 2], java.io.IOException: Couldn't 
> create RpcClient protocol
> at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:263)
> at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:239)
> at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:203)
> at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:165)
> at 
> org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.(BasicOzoneClientAdapterImpl.java:158)
> at 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.(OzoneClientAdapterImpl.java:50)
> at 
> org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:102)
> at 
> org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:155)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1821)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:2002)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781)
> at 
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
> at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
> at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
> at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
>   

[jira] [Assigned] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers

2019-09-30 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2213:
---

Assignee: Shweta

> Reduce key provider loading log level in 
> OzoneFileSystem#getAdditionalTokenIssuers
> --
>
> Key: HDDS-2213
> URL: https://issues.apache.org/jira/browse/HDDS-2213
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Shweta
>Priority: Minor
>
> OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client 
> tries to collect an ozone delegation token to run MR/Spark jobs but the ozone 
> file system does not have a KMS provider configured. In this case, we simply 
> return a null provider in the code below. This is a benign error, so we 
> should reduce the log level to debug.
> {code:java}
> KeyProvider keyProvider;
>  try {
>   keyProvider = getKeyProvider(); }
> catch (IOException ioe) {
>   LOG.error("Error retrieving KeyProvider.", ioe);
>   return null;
> }
> {code}
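
A minimal sketch of the proposed change, keeping the surrounding names from the 
snippet ({{LOG}}, {{getKeyProvider()}}) as given:

{code:java}
KeyProvider keyProvider;
try {
  keyProvider = getKeyProvider();
} catch (IOException ioe) {
  // A missing KMS provider is benign here, so log at debug rather than error.
  LOG.debug("Error retrieving KeyProvider.", ioe);
  return null;
}
{code}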



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Chen Liang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941393#comment-16941393
 ] 

Chen Liang commented on HDFS-14305:
---

Looks like the key idea of the v8 patch is to call {{nextInt(int bound)}}, 
which returns a non-negative value, instead of {{nextInt()}}, which can return 
a negative value. That way the range start is never negative, so we avoid the 
overlapping ranges. 

Assuming we will address the potential collision issue separately, +1 for the 
v08 patch.
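
A quick sketch of the difference with plain {{java.util.Random}} (an 
illustration of the idea, not the patch itself):

{code:java}
import java.util.Random;

public class NextIntSketch {
  public static void main(String[] args) {
    Random rand = new Random();
    int numNNs = 2;
    int intRange = Integer.MAX_VALUE / numNNs;

    int unbounded = rand.nextInt();        // any int, possibly negative
    int bounded = rand.nextInt(intRange);  // always in [0, intRange)

    // With the bounded draw, nnIndex * intRange + bounded stays inside
    // [nnIndex * intRange, (nnIndex + 1) * intRange), so ranges cannot overlap.
    System.out.println(unbounded + " vs " + bounded);
  }
}
{code}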

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: multi-sbnn, release-blocker
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, 
> HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, 
> HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNodes could have overlapping serial 
> number ranges. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the 
> ranges for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key, which 
> will cause clients to fail with an {{InvalidToken}} error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers

2019-09-30 Thread Xiaoyu Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-2213:
-
Description: 
OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client 
tries to collect an ozone delegation token to run MR/Spark jobs but the ozone 
file system does not have a KMS provider configured. In this case, we simply 
return a null provider in the code below. This is a benign error, so we should 
reduce the log level to debug.

{code:java}
KeyProvider keyProvider;
try {
  keyProvider = getKeyProvider();
} catch (IOException ioe) {
  LOG.error("Error retrieving KeyProvider.", ioe);
  return null;
}
{code}

  was:
OzoneFileSystem#getAdditionalTokenIssuers log an error when secure client tries 
to collect ozone delegation token to run MR/Spark jobs but ozone file system 
does not have a kms provider configured. In this case, we simply return null 
provider here in the code below. This is a benign error and we should reduce 
the log level to debug level.

 \{code}

KeyProvider keyProvider;
try {
 keyProvider = getKeyProvider();
} catch (IOException ioe) {
 LOG.error("Error retrieving KeyProvider.", ioe);
 return null;
}

{code}


> Reduce key provider loading log level in 
> OzoneFileSystem#getAdditionalTokenIssuers
> --
>
> Key: HDDS-2213
> URL: https://issues.apache.org/jira/browse/HDDS-2213
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Vivek Ratnavel Subramanian
>Priority: Minor
>
> OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client 
> tries to collect an ozone delegation token to run MR/Spark jobs but the ozone 
> file system does not have a KMS provider configured. In this case, we simply 
> return a null provider in the code below. This is a benign error, so we 
> should reduce the log level to debug.
> {code:java}
> KeyProvider keyProvider;
>  try {
>   keyProvider = getKeyProvider(); }
> catch (IOException ioe) {
>   LOG.error("Error retrieving KeyProvider.", ioe);
>   return null;
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager

2019-09-30 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941392#comment-16941392
 ] 

Anu Engineer commented on HDDS-2175:


bq. I feel that call stacks are invaluable when included in the bug report to 
the developer.

I completely agree. As I mentioned in my comment on GitHub, they are very 
useful tools for debugging. But we have to weigh the pros and cons of the 
approach. Here are some downsides, so I will list them out.

1. Code and style consistency - Generally, errors are propagated via error 
code and message (Golang, C, etc.) or exceptions (Java, C++, etc.). When we 
developed this interface, we chose to go with the error code and message 
approach instead of exceptions, so mixing these different approaches creates 
very inconsistent code flows.

2. Preventing Java server abstractions from leaking to the client side - Java 
exceptions are very Java-specific; it is hard to parse these exceptions even 
when they are part of normal log files, and it is difficult to read through a 
printed stack to even understand the issue. This gets compounded when 
exceptions stack. When we were writing this client interface, we wanted to 
make sure it is easy to write clients in other languages. A simple error code 
and message is universal: all languages understand it, which makes it easy to 
write clients in other languages that speak this protocol.

3. The current code experience - There are several parts of this code where 
the clients print out these messages to the users. If we add exceptions to 
those strings, the human readability of those error messages goes down. 

4. If we want to move to exceptions instead of error codes, it is possible 
(even though I think our future clients will suffer), but we need to move away 
from the error/message model. That is a lot of work, with very little benefit, 
other than the fact that we will have a consistent experience and exceptions 
will flow to the client side.

I had a chat with [~sdeka] and I said that I am all for increasing the 
fidelity of the error codes; that is, we can add more error codes if we want 
to fine-tune these messages. I am also all for logging more on the server 
side. So I am not against the patch; I just wanted to avoid *server-side Java 
exceptions crossing over to the client side*. I prefer a clear, simple 
contract between the server and client, as I think it makes it easier for 
future clients to be developed. 

> Propagate System Exceptions from the OzoneManager
> -
>
> Key: HDDS-2175
> URL: https://issues.apache.org/jira/browse/HDDS-2175
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Exceptions encountered while processing requests on the OM are categorized as 
> business exceptions and system exceptions. All of the business exceptions are 
> captured as OMException and have an associated status code which is returned 
> to the client. The handling of these is not going to be changed.
> Currently system exceptions are returned as INTERNAL ERROR to the client with 
> a 1 line message string from the exception. The scope of this jira is to 
> capture system exceptions and propagate the related information (including the 
> complete stack trace) back to the client.
> There are 3 sub-tasks required to achieve this:
> 1. Separate capture and handling for OMException and the other 
> exceptions(IOException). For system exceptions, use Hadoop IPC 
> ServiceException mechanism to send the stack trace to the client.
> 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and 
> propagate up to the OzoneManager layer (on the leader). Currently, these 
> exceptions are not being tracked.
> 3. Handle and propagate exceptions from Ratis.
> Will raise jira for each sub-task.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2175) Propagate System Exceptions from the OzoneManager

2019-09-30 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941392#comment-16941392
 ] 

Anu Engineer edited comment on HDDS-2175 at 9/30/19 10:54 PM:
--

{quote}I feel that call stacks are invaluable when included in the bug report 
to the developer.
{quote}
I completely agree. As I mentioned in my comment on GitHub, they are very 
useful tools for debugging. But we have to weigh the pros and cons of the 
approach. Here are some downsides, so I will list them out.

1. Code and style consistency - Generally, errors are propagated via error 
code and message (Golang, C, etc.) or exceptions (Java, C++, etc.). When we 
developed this interface, we chose to go with the error code and message 
approach instead of exceptions, so mixing these different approaches creates 
very inconsistent code flows.

2. Preventing Java server abstractions from leaking to the client side - Java 
exceptions are very Java-specific; it is hard to parse these exceptions even 
when they are part of normal log files, and it is difficult to read through a 
printed stack to even understand the issue. This gets compounded when 
exceptions stack. When we were writing this client interface, we wanted to 
make sure it is easy to write clients in other languages. A simple error code 
and message is universal: all languages understand it, which makes it easy to 
write clients in other languages that speak this protocol.

3. The current code experience - There are several parts of this code where 
the clients print out these messages to the users. If we add exceptions to 
those strings, the human readability of those error messages goes down.

4. If we want to move to exceptions instead of error codes, it is possible 
(even though I think our future clients will suffer), but we need to move away 
from the error/message model. That is a lot of work, with very little benefit, 
other than the fact that we will have a consistent experience and exceptions 
will flow to the client side.

I had a chat with [~sdeka] and I said that I am all for increasing the 
fidelity of the error codes; that is, we can add more error codes if we want 
to fine-tune these messages. I am also all for logging more on the server 
side. So I am not against the patch; I just wanted to avoid *server-side Java 
exceptions crossing over to the client side*. I prefer a clear, simple 
contract between the server and client, as I think it makes it easier for 
future clients to be developed.


was (Author: anu):
bq. I feel that call stacks are invaluable when included in the bug report to 
the developer.

I completely agree. As I mentioned in my comment in the Github, they are very 
useful tools for debugging. But we have to weigh the pros and cons of the 
approach. Here are some downsides, so I will list them out.

1. Code and Style Consistency - Generally, Errors are propagated via Error code 
and Message (Goland, C, etc) or Exceptions (Java, C++ etc). When we developed 
this interface, we choose to go with Error code and Message approach instead of 
Exceptions. So mixing these different approaches creates very inconsistent code 
flows.

2. Prevent Java server abstractions from leaking to client side - Java 
exceptions are very java specific; it is hard to parse these exceptions even 
when they are part of normal log files. It is difficult to read thru a printed 
stack to even understand the issue. This gets compounded when Exceptions stack. 
When we were writing this client interface, we wanted to make sure it is easy 
to write clients in other languages. A simple, Error code and a message is 
universal, that all languages understand and easy to write other language 
clients which can speak this protocol.

3. The current code experience - There are several parts of this code, where 
the clients print out these messages to the users. If we add exceptions to 
those strings, the human readability of those error messages goes down. 

4. If we want to move to exceptions instead of  error codes , it is possible 
(even though I think our future clients will suffer), but we need to move away 
from the error/message model. That is lot of work,  with very little benefit, 
other than the fact that we will have a consistent experience and exceptions 
will flow to the client side.

I had a chat with [~sdeka] and I said that I am all for increasing the fidelity 
of the error codes, that is we can add more error codes if we want to fine tune 
these messages. I am also all for logging more on the server side. So I am not 
against the patch, just wanted to avoid *server side Java exceptions crossing 
over to the client side*. I prefer a clear, simple contract between the server 
and client, I think it makes it easier for future clients to be developed more 
easily. 

> Propagate System Exceptions from the OzoneManager
> -
>
>  

[jira] [Created] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers

2019-09-30 Thread Xiaoyu Yao (Jira)
Xiaoyu Yao created HDDS-2213:


 Summary: Reduce key provider loading log level in 
OzoneFileSystem#getAdditionalTokenIssuers
 Key: HDDS-2213
 URL: https://issues.apache.org/jira/browse/HDDS-2213
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Vivek Ratnavel Subramanian


OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client 
tries to collect an ozone delegation token to run MR/Spark jobs but the ozone 
file system does not have a KMS provider configured. In this case, we simply 
return a null provider in the code below. This is a benign error, so we should 
reduce the log level to debug.

{code:java}
KeyProvider keyProvider;
try {
  keyProvider = getKeyProvider();
} catch (IOException ioe) {
  LOG.error("Error retrieving KeyProvider.", ioe);
  return null;
}
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10648) Expose Balancer metrics through Metrics2

2019-09-30 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941389#comment-16941389
 ] 

Wei-Chiu Chuang commented on HDFS-10648:


Thanks for doing this. Without metrics, HDFS-13783 isn't very useful.

> Expose Balancer metrics through Metrics2
> 
>
> Key: HDFS-10648
> URL: https://issues.apache.org/jira/browse/HDFS-10648
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer  mover, metrics
>Reporter: Mark Wagner
>Assignee: Chen Zhang
>Priority: Major
>  Labels: metrics
>
> The Balancer currently prints progress information to the console. For 
> deployments that run the balancer frequently, it would be helpful to collect 
> those metrics for publishing to the available sinks. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941381#comment-16941381
 ] 

Hadoop QA commented on HDFS-14305:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m  
4s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 47s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}107m 47s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}174m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 |
| JIRA Issue | HDFS-14305 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12981828/HDFS-14305-008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux bc13cb7fa98b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4d3c580 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27989/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27989/testReport/ |
| Max. process+thread count | 2864 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27989/console |
| Powered 

[jira] [Updated] (HDFS-14808) EC: Improper size values for corrupt ec block in LOG

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14808:
---
Component/s: ec

> EC: Improper size values for corrupt ec block in LOG 
> -
>
> Key: HDFS-14808
> URL: https://issues.apache.org/jira/browse/HDFS-14808
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Reporter: Harshakiran Reddy
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14808-01.patch
>
>
> If the block corruption reason is a size mismatch, the values shown and 
> compared in the log are ambiguous.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-7134) Replication count for a block should not update till the blocks have settled on Datanodes

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reopened HDFS-7134:
---

> Replication count for a block should not update till the blocks have settled 
> on Datanodes
> -
>
> Key: HDFS-7134
> URL: https://issues.apache.org/jira/browse/HDFS-7134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Affects Versions: 1.2.1, 2.6.0, 2.7.3
> Environment: Linux nn1.cluster1.com 2.6.32-431.20.3.el6.x86_64 #1 SMP 
> Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> [hadoop@nn1 conf]$ cat /etc/redhat-release
> CentOS release 6.5 (Final)
>Reporter: gurmukh singh
>Priority: Critical
>  Labels: HDFS
> Fix For: 3.1.0
>
>
> The count for the number of replicas for a block should not change till the 
> blocks have settled on the datanodes.
> Test Case:
> Hadoop Cluster with 1 namenode and 3 datanodes.
> nn1.cluster1.com(192.168.1.70)
> dn1.cluster1.com(192.168.1.72)
> dn2.cluster1.com(192.168.1.73)
> dn3.cluster1.com(192.168.1.74)
> Cluster up and running fine with replication set to "1" for the parameter 
> "dfs.replication" on all nodes
> 
> dfs.replication
> 1
> 
> To reduce the wait time, I have reduced the dfs.heartbeat and recheck 
> parameters.
> on datanode2 (192.168.1.72)
> [hadoop@dn2 ~]$ hadoop fs -Ddfs.replication=2 -put from_dn2 /
> [hadoop@dn2 ~]$ hadoop fs -ls /from_dn2
> Found 1 items
> -rw-r--r--   2 hadoop supergroup 17 2014-09-23 13:33 /from_dn2
> On Namenode
> ===
> As expected, copy was done from datanode2, one copy will go locally.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:53:16 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, 
> 192.168.1.73:50010]
> The blocks can be seen on the datanodes' disks as well, under the "current" 
> directory.
> Now, shut down datanode2 (192.168.1.73) and, as expected, the block moves to 
> another datanode to maintain a replication of 2.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:54:21 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, 
> 192.168.1.72:50010]
> But now, if I bring back datanode2, the namenode sees that this block is in 3 
> places and fires an invalidate command for datanode1 (192.168.1.72), yet the 
> replication count on the namenode is bumped to 3 immediately.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:56:12 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, 
> 192.168.1.72:50010, 192.168.1.73:50010]
> on Datanode1 - The invalidate command has been fired immediately and the 
> block deleted.
> =
> 2014-09-23 13:54:17,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Receiving blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: 
> /192.168.1.72:50010
> 2014-09-23 13:54:17,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: 
> /192.168.1.72:50010 size 17
> 2014-09-23 13:55:28,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Scheduling blk_8132629811771280764_1175 file 
> /space/disk1/current/blk_8132629811771280764 for deletion
> 2014-09-23 13:55:28,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Deleted blk_8132629811771280764_1175 at file 
> /space/disk1/current/blk_8132629811771280764
> The namenode still shows 3 replicas, even though one has been deleted, even 
> after more than 30 mins.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 14:21:27 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, 
> 192.168.1.72:50010, 192.168.1.73:50010]
> This could be dangerous if someone removes a datanode or the other 2 datanodes fail.
> On Datanode 1
> =
> Before datanode1 is brought back:
> [hadoop@dn1 conf]$ ls -l /space/disk*/current
> /space/disk1/current:
> total 28
> -rw-rw-r-- 1 hadoop hadoop   13 Sep 21 09:09 blk_2278001646987517832
> -rw-rw-r-- 1 hadoop hadoop   11 Sep 21 09:09 blk_2278001646987517832_1171.meta
> -rw-rw-r-- 1 hadoop hadoop   17 Sep 23 13:54 blk_8132629811771280764
> -rw-rw-r-- 1 hadoop hadoop   11 Sep 23 13:54 blk_8132629811771280764_1175.meta
> 

[jira] [Resolved] (HDFS-7134) Replication count for a block should not update till the blocks have settled on Datanodes

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-7134.
---
Resolution: Cannot Reproduce

Resolving as cannot reproduce.

> Replication count for a block should not update till the blocks have settled 
> on Datanodes
> -
>
> Key: HDFS-7134
> URL: https://issues.apache.org/jira/browse/HDFS-7134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Affects Versions: 1.2.1, 2.6.0, 2.7.3
> Environment: Linux nn1.cluster1.com 2.6.32-431.20.3.el6.x86_64 #1 SMP 
> Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> [hadoop@nn1 conf]$ cat /etc/redhat-release
> CentOS release 6.5 (Final)
>Reporter: gurmukh singh
>Priority: Critical
>  Labels: HDFS
> Fix For: 3.1.0
>
>
> The count for the number of replicas for a block should not change till the 
> blocks have settled on the datanodes.
> Test Case:
> Hadoop Cluster with 1 namenode and 3 datanodes.
> nn1.cluster1.com(192.168.1.70)
> dn1.cluster1.com(192.168.1.72)
> dn2.cluster1.com(192.168.1.73)
> dn3.cluster1.com(192.168.1.74)
> Cluster up and running fine with replication set to "1" for the parameter 
> "dfs.replication" on all nodes
> 
> dfs.replication
> 1
> 
> To reduce the wait time, I have reduced the dfs.heartbeat and recheck 
> parameters.
> on datanode2 (192.168.1.72)
> [hadoop@dn2 ~]$ hadoop fs -Ddfs.replication=2 -put from_dn2 /
> [hadoop@dn2 ~]$ hadoop fs -ls /from_dn2
> Found 1 items
> -rw-r--r--   2 hadoop supergroup 17 2014-09-23 13:33 /from_dn2
> On Namenode
> ===
> As expected, copy was done from datanode2, one copy will go locally.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:53:16 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, 
> 192.168.1.73:50010]
> The blocks can be seen on the datanodes' disks as well, under the "current" 
> directory.
> Now, shut down datanode2 (192.168.1.73) and, as expected, the block moves to 
> another datanode to maintain a replication of 2.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:54:21 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, 
> 192.168.1.72:50010]
> But now, if I bring back datanode2, the namenode sees that this block is in 3 
> places and fires an invalidate command for datanode1 (192.168.1.72), yet the 
> replication count on the namenode is bumped to 3 immediately.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:56:12 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, 
> 192.168.1.72:50010, 192.168.1.73:50010]
> On datanode1, the invalidate command was fired immediately and the block 
> deleted.
> =
> 2014-09-23 13:54:17,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Receiving blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: 
> /192.168.1.72:50010
> 2014-09-23 13:54:17,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: 
> /192.168.1.72:50010 size 17
> 2014-09-23 13:55:28,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Scheduling blk_8132629811771280764_1175 file 
> /space/disk1/current/blk_8132629811771280764 for deletion
> 2014-09-23 13:55:28,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Deleted blk_8132629811771280764_1175 at file 
> /space/disk1/current/blk_8132629811771280764
> The namenode still shows 3 replicas, even though one has been deleted, even 
> after more than 30 minutes.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 14:21:27 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, 
> 192.168.1.72:50010, 192.168.1.73:50010]
> This could be dangerous if someone removes a replica or the other 2 datanodes fail.
> On Datanode 1
> =
> Before datanode2 is brought back:
> [hadoop@dn1 conf]$ ls -l /space/disk*/current
> /space/disk1/current:
> total 28
> -rw-rw-r-- 1 hadoop hadoop   13 Sep 21 09:09 blk_2278001646987517832
> -rw-rw-r-- 1 hadoop hadoop   11 Sep 21 09:09 blk_2278001646987517832_1171.meta
> -rw-rw-r-- 1 hadoop hadoop   17 Sep 23 13:54 blk_8132629811771280764
> -rw-rw-r-- 1 hadoop hadoop   11 Sep 23 13:54 blk_8132629811771280764_1175.meta

[jira] [Commented] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-30 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941376#comment-16941376
 ] 

Wei-Chiu Chuang commented on HDFS-14754:


Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure they get added 
to lower releases.

> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, 
> HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, 
> HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, 
> HDFS-14754.008.patch
>
>
> Using EC RS-3-2 with 6 DNs.
> We came across a scenario where, in an EC block group of 5 blocks, the same 
> block was replicated thrice and two blocks went missing.
> The replicated block was not being deleted and the missing blocks could not 
> be reconstructed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-30 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941376#comment-16941376
 ] 

Wei-Chiu Chuang edited comment on HDFS-14754 at 9/30/19 10:20 PM:
--

Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure it gets added 
to lower releases.


was (Author: jojochuang):
Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure they get added 
to lower releases.

> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, 
> HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, 
> HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, 
> HDFS-14754.008.patch
>
>
> Using EC RS-3-2 with 6 DNs.
> We came across a scenario where, in an EC block group of 5 blocks, the same 
> block was replicated thrice and two blocks went missing.
> The replicated block was not being deleted and the missing blocks could not 
> be reconstructed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14754:
---
Component/s: ec

> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, 
> HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, 
> HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, 
> HDFS-14754.008.patch
>
>
> Using EC RS-3-2 with 6 DNs.
> We came across a scenario where, in an EC block group of 5 blocks, the same 
> block was replicated thrice and two blocks went missing.
> The replicated block was not being deleted and the missing blocks could not 
> be reconstructed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis

2019-09-30 Thread Tsz-wo Sze (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941365#comment-16941365
 ] 

Tsz-wo Sze commented on HDDS-2169:
--

Please ignore the previous Jenkins build.  It was testing an old patch 
(o2169_20190923.patch, just removed).

> Avoid buffer copies while submitting client requests in Ratis
> -
>
> Key: HDDS-2169
> URL: https://issues.apache.org/jira/browse/HDDS-2169
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Shashikant Banerjee
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, while sending write requests to Ratis from Ozone, the data is 
> encoded into a protobuf object, and the resulting protobuf is then converted 
> to a ByteString, which internally copies the buffer embedded inside the 
> protobuf yet again so that it can be submitted to the Ratis client. 
> Similarly, while building up the appendRequestProto for the appendRequest, 
> the data might be copied once more. The idea here is to let the client pass 
> the raw data (stateMachine data) separately to the Ratis client without the 
> copying overhead.
>  
> {code:java}
> private CompletableFuture<RaftClientReply> sendRequestAsync(
> ContainerCommandRequestProto request) {
>   try (Scope scope = GlobalTracer.get()
>   .buildSpan("XceiverClientRatis." + request.getCmdType().name())
>   .startActive(true)) {
> ContainerCommandRequestProto finalPayload =
> ContainerCommandRequestProto.newBuilder(request)
> .setTraceID(TracingUtil.exportCurrentSpan())
> .build();
> boolean isReadOnlyRequest = HddsUtils.isReadOnly(finalPayload);
> // finalPayload already has the byteString data embedded.
> ByteString byteString = finalPayload.toByteString(); // -> It involves a copy again.
> if (LOG.isDebugEnabled()) {
>   LOG.debug("sendCommandAsync {} {}", isReadOnlyRequest,
>   sanitizeForDebug(finalPayload));
> }
> return isReadOnlyRequest ?
> getClient().sendReadOnlyAsync(() -> byteString) :
> getClient().sendAsync(() -> byteString);
>   }
> }
> {code}
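
For illustration, a minimal sketch of the kind of zero-copy wrapping the 
description asks for, assuming the raw stateMachine data is already available 
as a ByteBuffer. Protobuf's UnsafeByteOperations can wrap a buffer without 
duplicating it, provided the caller never mutates the buffer afterwards; this 
is a sketch, not the actual Ozone/Ratis change:

{code:java}
import java.nio.ByteBuffer;

import com.google.protobuf.ByteString;
import com.google.protobuf.UnsafeByteOperations;

public final class ZeroCopyWrap {

  // Wraps the caller's buffer instead of copying it. Safe only if the
  // buffer is never modified afterwards -- hence "unsafe" in the API name.
  static ByteString wrapWithoutCopy(ByteBuffer stateMachineData) {
    return UnsafeByteOperations.unsafeWrap(stateMachineData);
  }

  // For contrast: the conversion flagged above always duplicates the bytes.
  static ByteString wrapWithCopy(ByteBuffer stateMachineData) {
    return ByteString.copyFrom(stateMachineData);
  }
}
{code}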



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis

2019-09-30 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze updated HDDS-2169:
-
Attachment: (was: o2169_20190923.patch)

> Avoid buffer copies while submitting client requests in Ratis
> -
>
> Key: HDDS-2169
> URL: https://issues.apache.org/jira/browse/HDDS-2169
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Shashikant Banerjee
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, while sending write requests to Ratis from Ozone, the data is 
> encoded into a protobuf object, and the resulting protobuf is then converted 
> to a ByteString, which internally copies the buffer embedded inside the 
> protobuf yet again so that it can be submitted to the Ratis client. 
> Similarly, while building up the appendRequestProto for the appendRequest, 
> the data might be copied once more. The idea here is to let the client pass 
> the raw data (stateMachine data) separately to the Ratis client without the 
> copying overhead.
>  
> {code:java}
> private CompletableFuture<RaftClientReply> sendRequestAsync(
> ContainerCommandRequestProto request) {
>   try (Scope scope = GlobalTracer.get()
>   .buildSpan("XceiverClientRatis." + request.getCmdType().name())
>   .startActive(true)) {
> ContainerCommandRequestProto finalPayload =
> ContainerCommandRequestProto.newBuilder(request)
> .setTraceID(TracingUtil.exportCurrentSpan())
> .build();
> boolean isReadOnlyRequest = HddsUtils.isReadOnly(finalPayload);
> // finalPayload already has the byteString data embedded.
> ByteString byteString = finalPayload.toByteString(); // -> It involves a copy again.
> if (LOG.isDebugEnabled()) {
>   LOG.debug("sendCommandAsync {} {}", isReadOnlyRequest,
>   sanitizeForDebug(finalPayload));
> }
> return isReadOnlyRequest ?
> getClient().sendReadOnlyAsync(() -> byteString) :
> getClient().sendAsync(() -> byteString);
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2212) Genconf tool should generate config files for secure cluster setup

2019-09-30 Thread Dinesh Chitlangia (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Chitlangia updated HDDS-2212:

Component/s: Tools

> Genconf tool should generate config files for secure cluster setup
> --
>
> Key: HDDS-2212
> URL: https://issues.apache.org/jira/browse/HDDS-2212
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Dinesh Chitlangia
>Priority: Major
>  Labels: newbie
>
> Ozone Genconf tool currently generates a minimal ozone-site.xml file.
> [~raje2411] was trying out a secure ozone setup over an existing HDP-2.x cluster 
> and found the config setup was not straightforward.
> This jira proposes extending the Genconf tool so it can generate the required 
> template config files for a secure setup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2212) Genconf tool should generate config files for secure cluster setup

2019-09-30 Thread Dinesh Chitlangia (Jira)
Dinesh Chitlangia created HDDS-2212:
---

 Summary: Genconf tool should generate config files for secure 
cluster setup
 Key: HDDS-2212
 URL: https://issues.apache.org/jira/browse/HDDS-2212
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Dinesh Chitlangia


Ozone Genconf tool currently generates a minimal ozone-site.xml file.

[~raje2411] was trying out a secure ozone setup over an existing HDP-2.x cluster 
and found the config setup was not straightforward.

This jira proposes extending the Genconf tool so it can generate the required 
template config files for a secure setup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2210?focusedWorklogId=320844=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320844
 ]

ASF GitHub Bot logged work on HDDS-2210:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:35
Start Date: 30/Sep/19 20:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1552: HDDS-2210. 
ContainerStateMachine should not be marked unhealthy if ap…
URL: https://github.com/apache/hadoop/pull/1552#issuecomment-536740211
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 1773 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 25 | Maven dependency ordering for branch |
   | -1 | mvninstall | 44 | hadoop-hdds in trunk failed. |
   | -1 | mvninstall | 38 | hadoop-ozone in trunk failed. |
   | -1 | compile | 19 | hadoop-hdds in trunk failed. |
   | -1 | compile | 12 | hadoop-ozone in trunk failed. |
   | +1 | checkstyle | 58 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 937 | branch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 19 | hadoop-hdds in trunk failed. |
   | -1 | javadoc | 17 | hadoop-ozone in trunk failed. |
   | 0 | spotbugs | 1027 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | -1 | findbugs | 33 | hadoop-hdds in trunk failed. |
   | -1 | findbugs | 17 | hadoop-ozone in trunk failed. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 26 | Maven dependency ordering for patch |
   | -1 | mvninstall | 33 | hadoop-hdds in the patch failed. |
   | -1 | mvninstall | 35 | hadoop-ozone in the patch failed. |
   | -1 | compile | 21 | hadoop-hdds in the patch failed. |
   | -1 | compile | 15 | hadoop-ozone in the patch failed. |
   | -1 | javac | 21 | hadoop-hdds in the patch failed. |
   | -1 | javac | 15 | hadoop-ozone in the patch failed. |
   | +1 | checkstyle | 53 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply |
   | +1 | shadedclient | 800 | patch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 19 | hadoop-hdds in the patch failed. |
   | -1 | javadoc | 16 | hadoop-ozone in the patch failed. |
   | -1 | findbugs | 30 | hadoop-hdds in the patch failed. |
   | -1 | findbugs | 17 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | -1 | unit | 24 | hadoop-hdds in the patch failed. |
   | -1 | unit | 24 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 29 | The patch does not generate ASF License warnings. |
   | | | 4251 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1552 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 1dedef292033 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 4d3c580 |
   | Default Java | 1.8.0_222 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-mvninstall-hadoop-ozone.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-compile-hadoop-hdds.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-compile-hadoop-ozone.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-javadoc-hadoop-hdds.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-javadoc-hadoop-ozone.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-findbugs-hadoop-hdds.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-findbugs-hadoop-ozone.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/patch-mvninstall-hadoop-hdds.txt
 |
   | 

[jira] [Work logged] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2210?focusedWorklogId=320843=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320843
 ]

ASF GitHub Bot logged work on HDDS-2210:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:35
Start Date: 30/Sep/19 20:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #1552: 
HDDS-2210. ContainerStateMachine should not be marked unhealthy if ap…
URL: https://github.com/apache/hadoop/pull/1552#discussion_r329774113
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java
 ##
 @@ -270,12 +270,16 @@ public void persistContainerSet(OutputStream out) throws 
IOException {
 // container happens outside of Ratis.
 IOUtils.write(builder.build().toByteArray(), out);
   }
+  
 
 Review comment:
   whitespace:end of line
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320843)
Time Spent: 20m  (was: 10m)

> ContainerStateMachine should not be marked unhealthy if applyTransaction 
> fails with closed container exception
> --
>
> Key: HDDS-2210
> URL: https://issues.apache.org/jira/browse/HDDS-2210
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, if applyTransaction fails, the stateMachine is marked unhealthy 
> and the next snapshot creation will fail, as a result of which the raftServer 
> will close down, leading to pipeline failure. A ClosedContainer exception 
> should be ignored when deciding whether to mark the stateMachine unhealthy.
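
A minimal, self-contained sketch of the proposed filtering; 
ClosedContainerException and the class below are stand-ins, not the actual 
ContainerStateMachine code:

{code:java}
class ClosedContainerException extends Exception {
}

class StateMachineHealth {

  private volatile boolean unhealthy = false;

  // Called when applyTransaction fails with some Throwable t.
  void onApplyTransactionFailure(Throwable t) {
    if (t instanceof ClosedContainerException) {
      // Expected race: a write arrived after the container was closed.
      // The client retries on another container, so the state machine
      // (and with it the pipeline) stays healthy.
      return;
    }
    // Any other failure is a real inconsistency: mark unhealthy so the
    // next snapshot fails and the raft server shuts the pipeline down.
    unhealthy = true;
  }

  boolean isUnhealthy() {
    return unhealthy;
  }
}
{code}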



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2211) Collect docker logs if env fails to start

2019-09-30 Thread ASF GitHub Bot (Jira)
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Occasionally an acceptance test docker environment fails to start up 
> properly. We need docker logs for analysis, but they are not being collected.
> https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated HDFS-14305:
-
Target Version/s: 2.10.0
  Labels: multi-sbnn release-blocker  (was: multi-sbnn)

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: multi-sbnn, release-blocker
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, 
> HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, 
> HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNodes could have overlapping ranges 
> for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 
> 100, and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then 
> the ranges for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key, which 
> will cause clients to fail with an {{InvalidToken}} error.
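
The overlap is easy to reproduce with the rotation formula above; a 
self-contained sketch, using 100 as a stand-in for {{Integer.MAX_VALUE}}:

{code:java}
import java.util.Random;

public class SerialRangeOverlap {
  public static void main(String[] args) {
    final int max = 100;               // stand-in for Integer.MAX_VALUE
    final int numNNs = 2;
    final int intRange = max / numNNs; // 50
    Random rand = new Random();
    for (int nnIndex = 0; nnIndex < numNNs; nnIndex++) {
      int nnRangeStart = intRange * nnIndex;
      int serialNo = rand.nextInt();   // the initial serial may be negative
      serialNo = (serialNo % intRange) + nnRangeStart;
      // Java's % keeps the dividend's sign, so nn0 lands in [-49, 49] and
      // nn1 in [1, 99]: the two ranges overlap on [1, 49].
      System.out.println("nn" + nnIndex + " serialNo=" + serialNo);
    }
  }
}
{code}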



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14814) RBF: RouterQuotaUpdateService supports inherited rule.

2019-09-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941301#comment-16941301
 ] 

Íñigo Goiri commented on HDFS-14814:


Regarding {{setQuota()}}, keep in mind we should keep the signature from 
ClientProtocol:
{code}
  @Idempotent
  void setQuota(String path, long namespaceQuota, long storagespaceQuota,
  StorageType type) throws IOException;
{code}
We cannot use {{@Override}} because this is not a full impl.
So let's keep the old method and add the extra one just for 
RouterQuotaUpdateService (see the sketch after this comment).
We could potentially also make it package-private instead of public.
(Same for all the aux methods.)

I think that in the loop in getGlobalQuota, you could just do the ifs and not 
do the if with the break; you will get the same number of comparisons.
In addition, I don't think getGlobalQuota() needs the check for operation as 
it's not supposed to be invoked through RPC directly.
As I mentioned before, let's make all these aux methods package private to 
distinguish them from the public API ones.
Maybe even adding some annotation?

For the log in RouterQuotaUpdateService#fixGlobalQuota we should fully leverage 
the logger:
{code}
162   LOG.info("[Fix Quota] src={} dst={} oldQuota={}/{} 
newQuota={}/{}",
163   location.getSrc(), location,
164   remoteQuota.getQuota(), remoteQuota.getSpaceQuota(),
165   gQuota.getQuota(), gQuota.getSpaceQuota());
{code}
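
A sketch of the split suggested above: the ClientProtocol-style public 
signature stays untouched and a package-private aux overload carries the 
extra behavior; the boolean parameter is purely hypothetical:

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.StorageType;

class Quota {

  // Public method keeps the ClientProtocol signature; no @Override since
  // this class is not a full ClientProtocol implementation.
  public void setQuota(String path, long namespaceQuota,
      long storagespaceQuota, StorageType type) throws IOException {
    setQuota(path, namespaceQuota, storagespaceQuota, type, true);
  }

  // Aux overload for RouterQuotaUpdateService, package-private so it stays
  // off the public API surface. The extra flag is a hypothetical example.
  void setQuota(String path, long namespaceQuota, long storagespaceQuota,
      StorageType type, boolean checkMountTable) throws IOException {
    // ... quota update logic ...
  }
}
{code}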

> RBF: RouterQuotaUpdateService supports inherited rule.
> --
>
> Key: HDFS-14814
> URL: https://issues.apache.org/jira/browse/HDFS-14814
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-14814.001.patch, HDFS-14814.002.patch, 
> HDFS-14814.003.patch, HDFS-14814.004.patch, HDFS-14814.005.patch, 
> HDFS-14814.006.patch, HDFS-14814.007.patch, HDFS-14814.008.patch, 
> HDFS-14814.009.patch, HDFS-14814.010.patch
>
>
> I want to add a rule *'The quota should be set the same as the nearest 
> parent'* to Global Quota. Supposing we have the mount table below.
> M1: /dir-a                            ns0->/dir-a     \{nquota=10,squota=20}
> M2: /dir-a/dir-b                 ns1->/dir-b     \{nquota=-1,squota=30}
> M3: /dir-a/dir-b/dir-c       ns2->/dir-c     \{nquota=-1,squota=-1}
> M4: /dir-d                           ns3->/dir-d     \{nquota=-1,squota=-1}
>  
> The quota for the remote locations on the namespaces should be:
>  ns0->/dir-a     \{nquota=10,squota=20}
>  ns1->/dir-b     \{nquota=10,squota=30}
>  ns2->/dir-c      \{nquota=10,squota=30}
>  ns3->/dir-d     \{nquota=-1,squota=-1}
>  
> The quota of a remote location is set the same as that of its corresponding 
> MountTable, and if the MountTable has no quota then the quota is set to that 
> of the nearest parent MountTable with a quota.
>  
> It's easy to implement. In RouterQuotaUpdateService, each time we compute the 
> currentQuotaUsage we can get the quota info for each MountTable. We can then 
> check and fix every MountTable whose quota doesn't match the rule above.
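
The rule reduces to "walk up the mount table until a set quota is found". A 
self-contained sketch over the name quotas of M1-M4 above, using -1 for 
"unset":

{code:java}
import java.util.HashMap;
import java.util.Map;

public class NearestParentQuota {

  // Effective quota = the mount entry's own quota, or that of the nearest
  // ancestor mount entry that has one; -1 means "unset".
  static long effectiveQuota(Map<String, Long> quotas, String path) {
    for (String p = path; !p.isEmpty(); p = parent(p)) {
      Long q = quotas.get(p);
      if (q != null && q != -1) {
        return q;
      }
    }
    return -1;
  }

  static String parent(String path) {
    int i = path.lastIndexOf('/');
    return i <= 0 ? "" : path.substring(0, i);
  }

  public static void main(String[] args) {
    Map<String, Long> nquota = new HashMap<>();
    nquota.put("/dir-a", 10L);
    nquota.put("/dir-a/dir-b", -1L);
    nquota.put("/dir-a/dir-b/dir-c", -1L);
    nquota.put("/dir-d", -1L);
    // Prints 10, 10, 10, -1 -- matching ns0..ns3 in the description.
    System.out.println(effectiveQuota(nquota, "/dir-a"));
    System.out.println(effectiveQuota(nquota, "/dir-a/dir-b"));
    System.out.println(effectiveQuota(nquota, "/dir-a/dir-b/dir-c"));
    System.out.println(effectiveQuota(nquota, "/dir-d"));
  }
}
{code}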



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2199) In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2199?focusedWorklogId=320821=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320821
 ]

ASF GitHub Bot logged work on HDDS-2199:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:05
Start Date: 30/Sep/19 20:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1551: HDDS-2199 In 
SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
URL: https://github.com/apache/hadoop/pull/1551#issuecomment-536728763
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 39 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 3 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | -1 | mvninstall | 31 | hadoop-hdds in trunk failed. |
   | -1 | mvninstall | 33 | hadoop-ozone in trunk failed. |
   | -1 | compile | 21 | hadoop-hdds in trunk failed. |
   | -1 | compile | 15 | hadoop-ozone in trunk failed. |
   | +1 | checkstyle | 50 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 843 | branch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 22 | hadoop-hdds in trunk failed. |
   | -1 | javadoc | 21 | hadoop-ozone in trunk failed. |
   | 0 | spotbugs | 943 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | -1 | findbugs | 33 | hadoop-hdds in trunk failed. |
   | -1 | findbugs | 20 | hadoop-ozone in trunk failed. |
   ||| _ Patch Compile Tests _ |
   | -1 | mvninstall | 36 | hadoop-hdds in the patch failed. |
   | -1 | mvninstall | 35 | hadoop-ozone in the patch failed. |
   | -1 | compile | 23 | hadoop-hdds in the patch failed. |
   | -1 | compile | 18 | hadoop-ozone in the patch failed. |
   | -1 | javac | 23 | hadoop-hdds in the patch failed. |
   | -1 | javac | 18 | hadoop-ozone in the patch failed. |
   | +1 | checkstyle | 55 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 703 | patch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 22 | hadoop-hdds in the patch failed. |
   | -1 | javadoc | 20 | hadoop-ozone in the patch failed. |
   | -1 | findbugs | 31 | hadoop-hdds in the patch failed. |
   | -1 | findbugs | 21 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | -1 | unit | 29 | hadoop-hdds in the patch failed. |
   | -1 | unit | 27 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 34 | The patch does not generate ASF License warnings. |
   | | | 2320 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1551 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 59d2934bbdb9 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 4d3c580 |
   | Default Java | 1.8.0_222 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-mvninstall-hadoop-ozone.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-compile-hadoop-hdds.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-compile-hadoop-ozone.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-javadoc-hadoop-hdds.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-javadoc-hadoop-ozone.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-findbugs-hadoop-hdds.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-findbugs-hadoop-ozone.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/patch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/patch-mvninstall-hadoop-ozone.txt
 |
   | compile | 

[jira] [Commented] (HDFS-13270) RBF: Router audit logger

2019-09-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941294#comment-16941294
 ] 

Íñigo Goiri commented on HDFS-13270:


Given the architecture with RBF, I think the most important thing is not to 
mimic the behavior of the NN fully.
What should be clear is that we executed an operation on a particular NN.
We should focus on making it easy to correlate NN audit logs with Router audit 
logs.
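
For illustration only, a hypothetical pair of log lines showing the kind of 
correlation this suggests; the Router line's format and its nnAddress/routerId 
fields are assumptions, not what any patch here emits:

{code}
# NameNode audit log (standard FSNamesystem.audit format); ip= is the Router:
allowed=true ugi=alice ip=/10.0.0.9 cmd=mkdirs src=/user/alice/d dst=null proto=rpc

# Hypothetical Router audit line carrying what is needed to correlate:
allowed=true ugi=alice clientIp=/10.0.0.5 cmd=mkdirs src=/user/alice/d nnAddress=nn1:8020 routerId=router0
{code}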

> RBF: Router audit logger
> 
>
> Key: HDFS-13270
> URL: https://issues.apache.org/jira/browse/HDFS-13270
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Affects Versions: 3.2.0
>Reporter: maobaolong
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-13270.001.patch, HDFS-13270.002.patch, 
> HDFS-13270.003.patch
>
>
> We can use a router audit logger to log the client info and command, because 
> FSNamesystem#AuditLogger's log thinks the clients all come from the router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14495) RBF: Duplicate FederationRPCMetrics

2019-09-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941292#comment-16941292
 ] 

Íñigo Goiri commented on HDFS-14495:


The only problem I would see is backwards compatibility.
Not sure what the guideline is for JMX, etc.
I would be concerned about removing a JMX bean, for example.

> RBF: Duplicate FederationRPCMetrics
> ---
>
> Key: HDFS-14495
> URL: https://issues.apache.org/jira/browse/HDFS-14495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: metrics
>Reporter: Akira Ajisaka
>Assignee: hemanthboyina
>Priority: Major
>
> There are two FederationRPCMetrics displayed in the Web UI (http://<hostname>:<port>/jmx) and most of the metrics are the same.
> * FederationRPCMetrics via {{@Metrics}} and {{@Metric}} annotations
> * FederationRPCMetrics via registering FederationRPCMBean
> Can we remove {{@Metrics}} and {{@Metric}} annotations to remove duplication?
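
For context, a hedged sketch of how the duplication can arise: one object 
registered through both the metrics2 annotation path and an explicit JMX 
MBean. The names are illustrative, not RBF's actual classes, and the sketch 
assumes the metrics system has been initialized:

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;
import org.apache.hadoop.metrics2.util.MBeans;

interface ExampleRpcMBean {
  long getProxyOps();
}

@Metrics(about = "Example federation RPC metrics", context = "dfs")
class ExampleRpcMetrics implements ExampleRpcMBean {

  // Path 1: the metrics2 system materializes this field when the source
  // is registered, and publishes it to /jmx.
  @Metric("Number of proxied ops")
  private MutableCounterLong proxyOps;

  ExampleRpcMetrics() {
    DefaultMetricsSystem.instance().register(
        "ExampleRpcMetrics", "Example metrics", this);
    // Path 2: the same object exposed again as an explicit JMX bean,
    // which is where the near-duplicate entries come from.
    MBeans.register("Router", "ExampleRpc", this);
  }

  @Override
  public long getProxyOps() {
    return proxyOps.value();
  }
}
{code}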



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2211) Collect docker logs if env fails to start

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2211?focusedWorklogId=320818=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320818
 ]

ASF GitHub Bot logged work on HDDS-2211:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:51
Start Date: 30/Sep/19 19:51
Worklog Time Spent: 10m 
  Work Description: adoroszlai commented on issue #1553: HDDS-2211. Collect 
docker logs if env fails to start
URL: https://github.com/apache/hadoop/pull/1553#issuecomment-536723744
 
 
   /label ozone
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320818)
Time Spent: 20m  (was: 10m)

> Collect docker logs if env fails to start
> -
>
> Key: HDDS-2211
> URL: https://issues.apache.org/jira/browse/HDDS-2211
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Occasionally an acceptance test docker environment fails to start up 
> properly. We need docker logs for analysis, but they are not being collected.
> https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2211) Collect docker logs if env fails to start

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2211:
-
Labels: pull-request-available  (was: )

> Collect docker logs if env fails to start
> -
>
> Key: HDDS-2211
> URL: https://issues.apache.org/jira/browse/HDDS-2211
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>
> Occasionally an acceptance test docker environment fails to start up 
> properly. We need docker logs for analysis, but they are not being collected.
> https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2211) Collect docker logs if env fails to start

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2211?focusedWorklogId=320816=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320816
 ]

ASF GitHub Bot logged work on HDDS-2211:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:50
Start Date: 30/Sep/19 19:50
Worklog Time Spent: 10m 
  Work Description: adoroszlai commented on pull request #1553: HDDS-2211. 
Collect docker logs if env fails to start
URL: https://github.com/apache/hadoop/pull/1553
 
 
   ## What changes were proposed in this pull request?
   
   1. Collect docker logs if the environment fails to start up (previously they 
were collected only after the actual test run)
   2. Copy docker logs to aggregate result directory
   3. Fail fast if datanodes cannot be started
   4. Avoid [ANSI codes in 
output](https://github.com/elek/ozone-ci-q4/blob/f700ebb2254527527960593496f15008245094d6/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3774)
   5. Sort test cases for consistent run ordering (Robot summary table is 
already sorted)
   
   https://issues.apache.org/jira/browse/HDDS-2211
   
   ## How was this patch tested?
   
   Ran `acceptance.sh` locally.  Also simulated failure during datanode 
startup, verified that docker log is saved.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320816)
Remaining Estimate: 0h
Time Spent: 10m

> Collect docker logs if env fails to start
> -
>
> Key: HDDS-2211
> URL: https://issues.apache.org/jira/browse/HDDS-2211
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Occasionally an acceptance test docker environment fails to start up 
> properly. We need docker logs for analysis, but they are not being collected.
> https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2019) Handle Set DtService of token in S3Gateway for OM HA

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2019?focusedWorklogId=320795=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320795
 ]

ASF GitHub Bot logged work on HDDS-2019:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:29
Start Date: 30/Sep/19 19:29
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #1489: HDDS-2019. 
Handle Set DtService of token in S3Gateway for OM HA.
URL: https://github.com/apache/hadoop/pull/1489#discussion_r329748308
 
 

 ##
 File path: 
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/OzoneServiceProvider.java
 ##
 @@ -20,33 +20,75 @@
 import org.apache.hadoop.hdds.conf.OzoneConfiguration;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.ozone.OmUtils;
+import org.apache.hadoop.ozone.s3.util.OzoneS3Util;
 import org.apache.hadoop.security.SecurityUtil;
 
 import javax.annotation.PostConstruct;
 import javax.enterprise.context.ApplicationScoped;
 import javax.enterprise.inject.Produces;
 import javax.inject.Inject;
+
+import java.util.Arrays;
+import java.util.Collection;
+
+import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_OM_NODES_KEY;
+import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY;
+
 /**
  * This class creates the OM service .
  */
 @ApplicationScoped
 public class OzoneServiceProvider {
 
-  private Text omServiceAdd;
+  private Text omServiceAddr;
+
+  private String omserviceID;
 
   @Inject
   private OzoneConfiguration conf;
 
   @PostConstruct
   public void init() {
-omServiceAdd = SecurityUtil.buildTokenService(OmUtils.
-getOmAddressForClients(conf));
+Collection<String> serviceIdList =
+conf.getTrimmedStringCollection(OZONE_OM_SERVICE_IDS_KEY);
+if (serviceIdList.size() == 0) {
+  // Non-HA cluster
+  omServiceAddr = SecurityUtil.buildTokenService(OmUtils.
+  getOmAddressForClients(conf));
+} else {
+  // HA cluster.
+  // For now, if multiple service ids are configured, we throw an exception.
 
 Review comment:
This is very similar to what HDFS HA does. Since OM supports service 
discovery, can we get the internal service id via service discovery?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320795)
Time Spent: 3h 20m  (was: 3h 10m)

> Handle Set DtService of token in S3Gateway for OM HA
> 
>
> Key: HDDS-2019
> URL: https://issues.apache.org/jira/browse/HDDS-2019
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> When OM HA is enabled and tokens are generated, the service name should be 
> set with the addresses of all OMs.
>  
> Currently, without HA, it is set to the OM RpcAddress string. This Jira is to 
> handle:
>  # Set dtService with all OM addresses. Right now in OMClientProducer, the UGI 
> is created with the S3 token, and the serviceName of the token is set to the 
> OMAddress; for the HA case, this should be set to all OM RPC addresses.
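
A minimal sketch of what the HA service field presumably becomes (addresses 
hypothetical); the real builder would first resolve each OM node's RPC 
address from the configuration:

{code:java}
import java.util.Arrays;
import java.util.Collection;

public class DtServiceSketch {

  // For HA, the token's service field becomes a comma-separated list of
  // every OM RPC address instead of a single address.
  static String buildServiceNameForToken(Collection<String> omRpcAddresses) {
    return String.join(",", omRpcAddresses);
  }

  public static void main(String[] args) {
    Collection<String> addrs =
        Arrays.asList("om1:9862", "om2:9862", "om3:9862");
    System.out.println(buildServiceNameForToken(addrs));
    // -> om1:9862,om2:9862,om3:9862
  }
}
{code}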



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2019) Handle Set DtService of token in S3Gateway for OM HA

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2019?focusedWorklogId=320788=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320788
 ]

ASF GitHub Bot logged work on HDDS-2019:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:25
Start Date: 30/Sep/19 19:25
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #1489: HDDS-2019. 
Handle Set DtService of token in S3Gateway for OM HA.
URL: https://github.com/apache/hadoop/pull/1489#discussion_r329746422
 
 

 ##
 File path: 
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/util/TestOzoneS3Util.java
 ##
 @@ -0,0 +1,129 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.s3.util;
+
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.ozone.OmUtils;
+import org.apache.hadoop.ozone.om.OMConfigKeys;
+import org.apache.hadoop.security.SecurityUtil;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.Collection;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP;
+import static org.junit.Assert.fail;
+
+/**
+ * Class used to test OzoneS3Util.
+ */
+public class TestOzoneS3Util {
+
+
+  private OzoneConfiguration configuration;
+
+  @Before
+  public void setConf() {
+configuration = new OzoneConfiguration();
+String serviceID = "omService";
+String nodeIDs = "om1,om2,om3";
+configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID);
+configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs);
+configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false);
+  }
+
+  @Test
+  public void testBuildServiceNameForToken() {
+String serviceID = "omService";
+String nodeIDs = "om1,om2,om3";
+configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID);
+configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs);
+configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false);
+
+Collection<String> nodeIDList = configuration.getStringCollection(
+OMConfigKeys.OZONE_OM_NODE_ID_KEY);
+
+String expectedOmServiceAddress = buildOMNodeAddresses(nodeIDList,
+serviceID);
+
+SecurityUtil.setConfiguration(configuration);
+String omserviceAddr = OzoneS3Util.buildServiceNameForToken(configuration,
+serviceID, nodeIDList);
+
+Assert.assertEquals(expectedOmServiceAddress, omserviceAddr);
+  }
+
+  @Test
+  public void testBuildServiceNameForTokenIncorrectConfig() {
+String serviceID = "omService";
+String nodeIDs = "om1,om2,om3";
+configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID);
+configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs);
+configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false);
+
+Collection<String> nodeIDList = configuration.getStringCollection(
+OMConfigKeys.OZONE_OM_NODE_ID_KEY);
+
+// Don't set om3 node rpc address.
+configuration.set(OmUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_ADDRESS_KEY,
+serviceID, "om1"), "om1:9862");
+configuration.set(OmUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_ADDRESS_KEY,
+serviceID, "om2"), "om2:9862");
+
+
+SecurityUtil.setConfiguration(configuration);
+
+try {
+  OzoneS3Util.buildServiceNameForToken(configuration,
+  serviceID, nodeIDList);
+  fail("testBuildServiceNameForTokenIncorrectConfig failed");
+} catch (IllegalArgumentException ex) {
+  GenericTestUtils.assertExceptionContains("Could not find rpcAddress " +
+  "for", ex);
+}
+
+
+  }
+
+
+  private String buildOMNodeAddresses(Collection<String> nodeIDList,
+  String serviceID) {
+StringBuilder omServiceAddrBuilder = new StringBuilder();
+int port = 9862;
+int nodesLength = nodeIDList.size();
+int counter = 0;
+for (String nodeID : nodeIDList) {
+  counter++;
+  String addr = nodeID + ":" + port++;
+  configuration.set(OmUtils.addKeySuffixes(
+  

[jira] [Work logged] (HDDS-2019) Handle Set DtService of token in S3Gateway for OM HA

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2019?focusedWorklogId=320787=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320787
 ]

ASF GitHub Bot logged work on HDDS-2019:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:24
Start Date: 30/Sep/19 19:24
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #1489: HDDS-2019. 
Handle Set DtService of token in S3Gateway for OM HA.
URL: https://github.com/apache/hadoop/pull/1489#discussion_r329746422
 
 

 ##
 File path: 
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/util/TestOzoneS3Util.java
 ##
 @@ -0,0 +1,129 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.s3.util;
+
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.ozone.OmUtils;
+import org.apache.hadoop.ozone.om.OMConfigKeys;
+import org.apache.hadoop.security.SecurityUtil;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.Collection;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP;
+import static org.junit.Assert.fail;
+
+/**
+ * Class used to test OzoneS3Util.
+ */
+public class TestOzoneS3Util {
+
+
+  private OzoneConfiguration configuration;
+
+  @Before
+  public void setConf() {
+configuration = new OzoneConfiguration();
+String serviceID = "omService";
+String nodeIDs = "om1,om2,om3";
+configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID);
+configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs);
+configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false);
+  }
+
+  @Test
+  public void testBuildServiceNameForToken() {
+String serviceID = "omService";
+String nodeIDs = "om1,om2,om3";
+configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID);
+configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs);
+configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false);
+
+Collection<String> nodeIDList = configuration.getStringCollection(
+OMConfigKeys.OZONE_OM_NODE_ID_KEY);
+
+String expectedOmServiceAddress = buildOMNodeAddresses(nodeIDList,
+serviceID);
+
+SecurityUtil.setConfiguration(configuration);
+String omserviceAddr = OzoneS3Util.buildServiceNameForToken(configuration,
+serviceID, nodeIDList);
+
+Assert.assertEquals(expectedOmServiceAddress, omserviceAddr);
+  }
+
+  @Test
+  public void testBuildServiceNameForTokenIncorrectConfig() {
+String serviceID = "omService";
+String nodeIDs = "om1,om2,om3";
+configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID);
+configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs);
+configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false);
+
+Collection<String> nodeIDList = configuration.getStringCollection(
+OMConfigKeys.OZONE_OM_NODE_ID_KEY);
+
+// Don't set om3 node rpc address.
+configuration.set(OmUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_ADDRESS_KEY,
+serviceID, "om1"), "om1:9862");
+configuration.set(OmUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_ADDRESS_KEY,
+serviceID, "om2"), "om2:9862");
+
+
+SecurityUtil.setConfiguration(configuration);
+
+try {
+  OzoneS3Util.buildServiceNameForToken(configuration,
+  serviceID, nodeIDList);
+  fail("testBuildServiceNameForTokenIncorrectConfig failed");
+} catch (IllegalArgumentException ex) {
+  GenericTestUtils.assertExceptionContains("Could not find rpcAddress " +
+  "for", ex);
+}
+
+
+  }
+
+
+  private String buildOMNodeAddresses(Collection<String> nodeIDList,
+  String serviceID) {
+StringBuilder omServiceAddrBuilder = new StringBuilder();
+int port = 9862;
+int nodesLength = nodeIDList.size();
+int counter = 0;
+for (String nodeID : nodeIDList) {
+  counter++;
+  String addr = nodeID + ":" + port++;
+  configuration.set(OmUtils.addKeySuffixes(
+  

[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941272#comment-16941272
 ] 

Konstantin Shvachko commented on HDFS-14305:


Attached v08, which fixes {{TestFailoverWithBlockTokensEnabled}} and the 
findbugs warning.
[~jojochuang], correct, this problem is in 2.10 as well. Trying to fix it 
before the release.

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: multi-sbnn
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, 
> HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, 
> HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNodes could have overlapping ranges 
> for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 
> 100, and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then 
> the ranges for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key, which 
> will cause clients to fail with an {{InvalidToken}} error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-14305:
---
Attachment: HDFS-14305-008.patch

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: multi-sbnn
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, 
> HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, 
> HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.<nameservice>}}.
> However, with this approach, different NameNodes could have overlapping ranges 
> for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges 
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key which 
> will cause clients to fail because of {{InvalidToken}} error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2210:
-
Labels: pull-request-available  (was: )

> ContainerStateMachine should not be marked unhealthy if applyTransaction 
> fails with closed container exception
> --
>
> Key: HDDS-2210
> URL: https://issues.apache.org/jira/browse/HDDS-2210
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>
> Currently, if applyTransaction fails, the stateMachine is marked unhealthy 
> and the next snapshot creation will fail. As a result, the raftServer will 
> close down, leading to pipeline failure. A ClosedContainer exception should 
> not cause the stateMachine to be marked unhealthy.
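
For illustration, the intended handling could look like the sketch below. The 
type and field names here are assumptions made for the sketch, not the exact 
classes touched by the patch:
{code:java}
// Sketch only: ContainerNotOpenException and the health flag are stand-ins
// for whatever the patch actually uses.
import java.util.concurrent.atomic.AtomicBoolean;

class ApplyTransactionFailureHandler {
  static class ContainerNotOpenException extends Exception { }

  private final AtomicBoolean isStateMachineHealthy = new AtomicBoolean(true);

  void onApplyTransactionFailure(Throwable t) {
    if (t instanceof ContainerNotOpenException) {
      // A write racing with a container close is expected; ignore it
      // rather than poisoning the whole state machine.
      return;
    }
    // Any other failure is real: mark unhealthy so the next snapshot,
    // and hence the RaftServer, fails fast.
    isStateMachineHealthy.set(false);
  }
}
{code}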



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception

2019-09-30 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-2210:
--
Status: Patch Available  (was: Open)

> ContainerStateMachine should not be marked unhealthy if applyTransaction 
> fails with closed container exception
> --
>
> Key: HDDS-2210
> URL: https://issues.apache.org/jira/browse/HDDS-2210
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if applyTransaction fails, the stateMachine is marked unhealthy 
> and the next snapshot creation will fail. As a result, the raftServer will 
> close down, leading to pipeline failure. A ClosedContainer exception should 
> not cause the stateMachine to be marked unhealthy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2210?focusedWorklogId=320782=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320782
 ]

ASF GitHub Bot logged work on HDDS-2210:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:14
Start Date: 30/Sep/19 19:14
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1552: HDDS-2210. 
ContainerStateMachine should not be marked unhealthy if ap…
URL: https://github.com/apache/hadoop/pull/1552
 
 
   …plyTransaction fails with closed container exception.
   
   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320782)
Remaining Estimate: 0h
Time Spent: 10m

> ContainerStateMachine should not be marked unhealthy if applyTransaction 
> fails with closed container exception
> --
>
> Key: HDDS-2210
> URL: https://issues.apache.org/jira/browse/HDDS-2210
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if applyTransaction fails, the stateMachine is marked unhealthy 
> and the next snapshot creation will fail. As a result, the raftServer will 
> close down, leading to pipeline failure. A ClosedContainer exception should 
> not cause the stateMachine to be marked unhealthy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2019) Handle Set DtService of token in S3Gateway for OM HA

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2019?focusedWorklogId=320780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320780
 ]

ASF GitHub Bot logged work on HDDS-2019:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:09
Start Date: 30/Sep/19 19:09
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #1489: HDDS-2019. 
Handle Set DtService of token in S3Gateway for OM HA.
URL: https://github.com/apache/hadoop/pull/1489#discussion_r329740302
 
 

 ##
 File path: 
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/util/TestOzoneS3Util.java
 ##
 @@ -0,0 +1,129 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.s3.util;
+
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.ozone.OmUtils;
+import org.apache.hadoop.ozone.om.OMConfigKeys;
+import org.apache.hadoop.security.SecurityUtil;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.Collection;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP;
+import static org.junit.Assert.fail;
+
+/**
+ * Class used to test OzoneS3Util.
+ */
+public class TestOzoneS3Util {
+
+
+  private OzoneConfiguration configuration;
+
+  @Before
+  public void setConf() {
+configuration = new OzoneConfiguration();
+String serviceID = "omService";
+String nodeIDs = "om1,om2,om3";
+configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID);
+configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs);
+configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false);
+  }
+
+  @Test
+  public void testBuildServiceNameForToken() {
+String serviceID = "omService";
 
 Review comment:
   Line 55-59 is redundant since you have them in setConf before each test.
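
   For illustration, the de-duplicated shape would look roughly like this 
   generic JUnit 4 sketch (names are placeholders, not the actual test):
{code:java}
// Generic sketch of the suggested pattern: shared fixtures live only in
// @Before, so each test reads them instead of re-declaring locals.
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class SetupReusePattern {
  private String serviceID;

  @Before
  public void setUp() {
    serviceID = "omService"; // declared once for every test
  }

  @Test
  public void usesSharedFixture() {
    // No local redeclaration of serviceID here.
    assertEquals("omService", serviceID);
  }
}
{code}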
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320780)
Time Spent: 2h 50m  (was: 2h 40m)

> Handle Set DtService of token in S3Gateway for OM HA
> 
>
> Key: HDDS-2019
> URL: https://issues.apache.org/jira/browse/HDDS-2019
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When OM HA is enabled and tokens are generated, the service name should be 
> set with the addresses of all OMs.
>  
> Currently, without HA, it is set to the OM RpcAddress string. This Jira is to 
> handle:
>  # Setting dtService with all OM addresses. Right now in OMClientProducer, 
> UGI is created with the S3 token, and the serviceName of the token is set to 
> the OMAddress; for the HA case, this should be set with all OM RPC addresses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs

2019-09-30 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941254#comment-16941254
 ] 

Mukul Kumar Singh commented on HDFS-14884:
--

 [^hdfs_distcp.patch] reproduces the failure

> Add sanity check that zone key equals feinfo key while setting Xattrs
> -
>
> Key: HDFS-14884
> URL: https://issues.apache.org/jira/browse/HDFS-14884
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, hdfs
>Reporter: Mukul Kumar Singh
>Priority: Major
> Attachments: hdfs_distcp.patch
>
>
> Currently, it is possible to set an extended attribute where the zone key is 
> not the same as the feinfo key. This jira will add a precondition check 
> before setting it.
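
The kind of precondition being proposed could look like the following sketch 
(class and method names here are hypothetical, not the eventual patch):
{code:java}
// Illustrative sketch of the proposed check; names are hypothetical.
import com.google.common.base.Preconditions;

final class EncryptionZoneXAttrCheck {
  static void checkZoneKeyMatchesFeInfoKey(String zoneKeyName,
      String feInfoKeyName) {
    // Refuse to set the xattr when the two key names disagree.
    Preconditions.checkArgument(zoneKeyName.equals(feInfoKeyName),
        "FileEncryptionInfo key %s does not match encryption zone key %s",
        feInfoKeyName, zoneKeyName);
  }
}
{code}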



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs

2019-09-30 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-14884:
-
Attachment: hdfs_distcp.patch

> Add sanity check that zone key equals feinfo key while setting Xattrs
> -
>
> Key: HDFS-14884
> URL: https://issues.apache.org/jira/browse/HDFS-14884
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, hdfs
>Reporter: Mukul Kumar Singh
>Priority: Major
> Attachments: hdfs_distcp.patch
>
>
> Currently, it is possible to set an extended attribute where the zone key is 
> not the same as the feinfo key. This jira will add a precondition check 
> before setting it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14528:
---
Labels: multi-sbnn  (was: )

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, 
> HDFS-14528.005.patch, HDFS-14528.2.Patch, ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby namenode, manual failover throws 
> an exception in some cases.*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases :
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 to NN2 while NN3 is down, the 
> exception is thrown.
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover from NN1 to NN3 while NN3's ZKFC (ZKFC3) is 
> down, the exception is thrown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14855) client always print standbyexception info with multi standby namenode

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14855:
---
Labels: multi-sbnn  (was: )

> client always print standbyexception info with multi standby namenode
> -
>
> Key: HDFS-14855
> URL: https://issues.apache.org/jira/browse/HDFS-14855
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Shen Yinjie
>Assignee: Shen Yinjie
>Priority: Major
>  Labels: multi-sbnn
> Attachments: image-2019-09-19-20-04-54-591.png
>
>
> When the cluster has more than two standby namenodes, a client executing a 
> shell command will print StandbyException info. May we change the log level 
> from INFO to DEBUG?
>  !image-2019-09-19-20-04-54-591.png! 
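
The proposed change amounts to something like the following sketch 
(SLF4J-style logging assumed; this is not the exact call site in the client 
failover code):
{code:java}
// Sketch of the proposed severity change; the method and message are
// illustrative, not the real retry/failover code path.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class FailoverLogging {
  private static final Logger LOG =
      LoggerFactory.getLogger(FailoverLogging.class);

  void onStandbyException(String namenode, Exception e) {
    // Before: LOG.info(...) printed for every standby NN the client tries.
    // After: keep the detail, but only when debugging.
    LOG.debug("Exception while invoking call on {}; trying next NameNode",
        namenode, e);
  }
}
{code}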



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDDS-2211) Collect docker logs if env fails to start

2019-09-30 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDDS-2211 started by Attila Doroszlai.
--
> Collect docker logs if env fails to start
> -
>
> Key: HDDS-2211
> URL: https://issues.apache.org/jira/browse/HDDS-2211
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>
> Occasionally some acceptance test docker environment fails to start up 
> properly.  We need docker logs for analysis, but they are not being collected.
> https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14201) Ability to disallow safemode NN to become active

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14201:
---
Labels: multi-sbnn  (was: )

> Ability to disallow safemode NN to become active
> 
>
> Key: HDFS-14201
> URL: https://issues.apache.org/jira/browse/HDFS-14201
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: auto-failover
>Affects Versions: 3.1.1, 2.9.2
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: multi-sbnn
> Fix For: 3.3.0
>
> Attachments: HDFS-14201.001.patch, HDFS-14201.002.patch, 
> HDFS-14201.003.patch, HDFS-14201.004.patch, HDFS-14201.005.patch, 
> HDFS-14201.006.patch, HDFS-14201.007.patch, HDFS-14201.008.patch, 
> HDFS-14201.009.patch
>
>
> Currently with HA, a Namenode in safemode can possibly be selected as 
> active, even though, for availability of both reads and writes, Namenodes 
> not in safemode are better choices.
> It can take tens of minutes for a cold-started Namenode to get out of 
> safemode, especially when there are a large number of files and blocks in 
> HDFS. That means if a Namenode in safemode becomes active, the cluster will 
> not be fully functioning for quite a while, even though it could be if some 
> Namenode not in safemode were active instead.
> The proposal here is to add an option that allows a Namenode to report 
> itself as UNHEALTHY to ZKFC while it is in safemode, so that only a fully 
> functioning Namenode is allowed to become active, improving the general 
> availability of the cluster.
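
Roughly, such an option could hook into the ZKFC health check as in the 
sketch below (the configuration flag and helper types are made up for 
illustration; the actual patch may wire this differently):
{code:java}
// Sketch of the proposal; the flag and NameNodeState type are made up.
import org.apache.hadoop.ha.HealthCheckFailedException;

class SafemodeHealthCheck {
  private final boolean reportUnhealthyInSafemode; // hypothetical config
  private final NameNodeState nn;

  SafemodeHealthCheck(boolean reportUnhealthyInSafemode, NameNodeState nn) {
    this.reportUnhealthyInSafemode = reportUnhealthyInSafemode;
    this.nn = nn;
  }

  // Called from the NameNode side of HAServiceProtocol#monitorHealth().
  void monitorHealth() throws HealthCheckFailedException {
    if (reportUnhealthyInSafemode && nn.isInSafeMode()) {
      // ZKFC treats this as unhealthy, so this NN will not be elected
      // active until it leaves safemode.
      throw new HealthCheckFailedException(
          "NameNode is in safemode and configured to report unhealthy");
    }
  }

  interface NameNodeState {
    boolean isInSafeMode();
  }
}
{code}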



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941238#comment-16941238
 ] 

Wei-Chiu Chuang commented on HDFS-14305:


HDFS-6440 was backported into branch-2 by HDFS-14205. I'm assuming the issue 
under debate also impacts the 2.10 release?

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: multi-sbnn
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305.001.patch, 
> HDFS-14305.002.patch, HDFS-14305.003.patch, HDFS-14305.004.patch, 
> HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.<nameservice>}}.
> However, with this approach, different NameNodes could have overlapping ranges 
> for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges 
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key which 
> will cause clients to fail because of {{InvalidToken}} error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14793) BlockTokenSecretManager should LOG block token range it operates on.

2019-09-30 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941235#comment-16941235
 ] 

Wei-Chiu Chuang commented on HDFS-14793:


Looks good, but it is superseded by HDFS-14305.

> BlockTokenSecretManager should LOG block token range it operates on.
> 
>
> Key: HDFS-14793
> URL: https://issues.apache.org/jira/browse/HDFS-14793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14793.001.patch
>
>
> At startup, log enough information to identify the range of block token keys 
> for the NameNode. This should make it easier to debug issues with block 
> tokens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14378) Simplify the design of multiple NN and both logic of edit log roll and checkpoint

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14378:
---
Labels: multi-sbnn  (was: )

> Simplify the design of multiple NN and both logic of edit log roll and 
> checkpoint
> -
>
> Key: HDFS-14378
> URL: https://issues.apache.org/jira/browse/HDFS-14378
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Affects Versions: 3.1.2
>Reporter: star
>Assignee: star
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14378-trunk.001.patch, HDFS-14378-trunk.002.patch, 
> HDFS-14378-trunk.003.patch, HDFS-14378-trunk.004.patch, 
> HDFS-14378-trunk.005.patch, HDFS-14378-trunk.006.patch
>
>
>       HDFS-6440 introduced a mechanism to support more than 2 NNs. It 
> implements a first-writer-wins policy to avoid duplicated fsimage 
> downloading. The variable 'isPrimaryCheckPointer' is used to hold the 
> first-writer state, with which the SNN will provide the fsimage for the ANN 
> next time. Then we have three roles in the NN cluster: the ANN, one primary 
> SNN, and one or more normal SNNs.
>       Since HDFS-12248, there may be more than two primary SNNs shortly 
> after an exception occurs. It handles a scenario in which the SNN will not 
> upload the fsimage on IOE and Interrupted exceptions. Though this will not 
> cause any further functional issues, it is inconsistent. 
>       Furthermore, the edit log may be rolled more frequently than necessary 
> with multiple Standby NameNodes, HDFS-14349. (I'm not so sure about this; I 
> will verify by unit tests, or anyone could point it out.)
>       Above all, I'm wondering if we could make it simple with the following 
> changes:
>  * There are only two roles: ANN and SNN.
>  * The ANN will roll its edit log every DFS_HA_LOGROLL_PERIOD_KEY period.
>  * The ANN will select an SNN to download the checkpoint from.
> The SNN will just do logtail and checkpoint, then provide a servlet for 
> fsimage downloading as normal. The SNN will not try to roll the edit log or 
> send a checkpoint request to the ANN.
> In a word, the ANN will be more active. Suggestions are welcomed.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14646) Standby NameNode should not upload fsimage to an inappropriate NameNode.

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14646:
---
Labels: multi-sbnn  (was: )

> Standby NameNode should not upload fsimage to an inappropriate NameNode.
> 
>
> Key: HDFS-14646
> URL: https://issues.apache.org/jira/browse/HDFS-14646
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14646.000.patch, HDFS-14646.001.patch, 
> HDFS-14646.002.patch, HDFS-14646.003.patch, HDFS-14646.004.patch
>
>
> *Problem Description:*
>  In the multi-NameNode scenario, when a SNN uploads a FsImage, it will put 
> the image to all other NNs (whether the peer NN is an ANN or not), and even 
> if the peer NN immediately replies an error (such as 
> TransferResult.NOT_ACTIVE_NAMENODE_FAILURE, TransferResult 
> .OLD_TRANSACTION_ID_FAILURE, etc.), the local SNN will not terminate the put 
> process immediately, but will put the FsImage completely to the peer NN, and 
> will not read the peer NN's reply until the put is completed.
> Depending on the version of Jetty, this behavior can lead to different 
> consequences: 
> *1. Under Hadoop 2.7.2 (with Jetty 6.1.26)*
>  After the peer NN calls HttpServletResponse.sendError(), the underlying TCP 
> connection will still be established, and the data the SNN sends will be 
> read by the Jetty framework itself on the peer NN side, so the SNN will 
> pointlessly keep sending the FsImage to the peer NN, wasting time and 
> bandwidth. In a relatively large HDFS cluster, the size of the FsImage can 
> often reach about 30GB; this is indeed a big waste.
> *2. Under the newest release-3.2.0-RC1 (with Jetty 9.3.24) and trunk (with 
> Jetty 9.3.27)*
>  After the peer NN calls HttpServletResponse.sendError(), the underlying TCP 
> connection will be auto-closed, and then the SNN will directly get an "Error 
> writing request body to server" exception, as below. Note this test needs a 
> relatively big FSImage (e.g. at the 10MB level):
> {code:java}
> 2019-08-17 03:59:25,413 INFO namenode.TransferFsImage: Sending fileName: 
> /tmp/hadoop-root/dfs/name/current/fsimage_3364240, fileSize: 
> 9864721. Sent total: 524288 bytes. Size of last segment intended to send: 
> 4096 bytes.
>  java.io.IOException: Error writing request body to server
>  at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3587)
>  at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3570)
>  at 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.copyFileToStream(TransferFsImage.java:396)
>  at 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.writeFileToPutRequest(TransferFsImage.java:340)
>  at 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImage(TransferFsImage.java:314)
>  at 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImageFromStorage(TransferFsImage.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:277)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:272)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  2019-08-17 03:59:25,422 INFO namenode.TransferFsImage: Sending fileName: 
> /tmp/hadoop-root/dfs/name/current/fsimage_3364240, fileSize: 
> 9864721. Sent total: 851968 bytes. Size of last segment intended to send: 
> 4096 bytes.
>  java.io.IOException: Error writing request body to server
>  at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3587)
>  at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3570)
>  at 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.copyFileToStream(TransferFsImage.java:396)
>  at 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.writeFileToPutRequest(TransferFsImage.java:340)
>   {code}
>                   
> *Solution:*
>  A standby NameNode should not upload the fsimage to an inappropriate 
> NameNode: when it plans to put an FsImage to the peer NN, it needs to check 
> whether it really needs to put it at this time.
> In detail, the local SNN should establish an HTTP connection with the peer 
> NN, send the put request, and then immediately read the response (this is 
> the key point). If the peer 

[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14305:
---
Labels: multi-sbnn  (was: )

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Xiaoqiao He
>Priority: Major
>  Labels: multi-sbnn
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305.001.patch, 
> HDFS-14305.002.patch, HDFS-14305.003.patch, HDFS-14305.004.patch, 
> HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.<nameservice>}}.
> However, with this approach, different NameNodes could have overlapping ranges 
> for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges 
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key which 
> will cause clients to fail because of {{InvalidToken}} error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2211) Collect docker logs if env fails to start

2019-09-30 Thread Attila Doroszlai (Jira)
Attila Doroszlai created HDDS-2211:
--

 Summary: Collect docker logs if env fails to start
 Key: HDDS-2211
 URL: https://issues.apache.org/jira/browse/HDDS-2211
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: test
Reporter: Attila Doroszlai
Assignee: Attila Doroszlai


Occasionally some acceptance test docker environment fails to start up 
properly.  We need docker logs for analysis, but they are not being collected.

https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14883) NPE when the second SNN is starting

2019-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14883:
---
Labels: multi-sbnn  (was: )

> NPE when the second SNN is starting
> ---
>
> Key: HDFS-14883
> URL: https://issues.apache.org/jira/browse/HDFS-14883
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ranith Sardar
>Assignee: Ranith Sardar
>Priority: Major
>  Labels: multi-sbnn
>
>  
> {noformat}
> | WARN | qtp79782883-47 | /imagetransfer | ServletHandler.java:632
>  java.io.IOException: PutImage failed. java.lang.NullPointerException
>  at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:198)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:485)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2187) ozone-mr test fails with No FileSystem for scheme "o3fs"

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2187?focusedWorklogId=320734=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320734
 ]

ASF GitHub Bot logged work on HDDS-2187:


Author: ASF GitHub Bot
Created on: 30/Sep/19 18:27
Start Date: 30/Sep/19 18:27
Worklog Time Spent: 10m 
  Work Description: adoroszlai commented on issue #1537: HDDS-2187. 
ozone-mr test fails with No FileSystem for scheme o3fs
URL: https://github.com/apache/hadoop/pull/1537#issuecomment-536689992
 
 
   > datanodes in ozonesecure-mr are not started AFAIK
   > 
   > 
https://github.com/elek/ozone-ci-q4/blob/master/pr/pr-hdds-2187-2nl4x/acceptance/output.log
   
   I think it's `ozone-recon`, not `ozonesecure-mr`:
   
   
https://github.com/elek/ozone-ci-q4/blob/c4e58095c9584185c75d6c6af4721c33edba854a/pr/pr-hdds-2187-2nl4x/acceptance/output.log#L1453
   
   I've seen this problem intermittently in earlier runs, eg:
   
   
https://github.com/elek/ozone-ci-q4/blob/f700ebb2254527527960593496f15008245094d6/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768
   
   Working on a change to collect the docker logs.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320734)
Time Spent: 1.5h  (was: 1h 20m)

> ozone-mr test fails with No FileSystem for scheme "o3fs"
> 
>
> Key: HDDS-2187
> URL: https://issues.apache.org/jira/browse/HDDS-2187
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> HDDS-2101 changed how Ozone filesystem provider is configured.  {{ozone-mr}} 
> tests [started 
> failing|https://github.com/elek/ozone-ci/blob/2f2c99652af6b26a95f08eece9e545f0d72ccf45/pr/pr-hdds-2101-rtz55/acceptance/output.log#L255-L263],
>  but it [wasn't 
> noticed|https://github.com/elek/ozone-ci/blob/master/pr/pr-hdds-2101-rtz55/acceptance/result]
>  due to HDDS-2185.
> {code}
> Running command 'ozone fs -mkdir /user'
> ${output} = mkdir: No FileSystem for scheme "o3fs"
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs

2019-09-30 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created HDFS-14884:


 Summary: Add sanity check that zone key equals feinfo key while 
setting Xattrs
 Key: HDFS-14884
 URL: https://issues.apache.org/jira/browse/HDFS-14884
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption, hdfs
Reporter: Mukul Kumar Singh


Currently, it is possible to set an external attribute where the  zone key is 
not the same as  feinfo key. This jira will add a precondition before setting 
this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-30 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941217#comment-16941217
 ] 

Konstantin Shvachko commented on HDFS-14305:


Hey [~hexiaoqiao], I don't think I understand what you mean.
The original bug was that the ranges are not disjoint, so they could cause 
collision of block tokens issued by different NameNodes. Both v06 and v07 
patches solve this problem.
We can still have a collision if we add new NameNodes to the cluster and 
restart them in arbitrary order. As I suggested, we should try to solve this 
problem in a follow-up jira. The v06 patch introduced smaller ranges, so 
upgrading to this version will create collisions even if one keeps the number 
of NameNodes unchanged. The v07 patch just fixes the arithmetic bug and keeps 
the ranges as they were before. 
Hope this makes sense.


> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Xiaoqiao He
>Priority: Major
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305.001.patch, 
> HDFS-14305.002.patch, HDFS-14305.003.patch, HDFS-14305.004.patch, 
> HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.<nameservice>}}.
> However, with this approach, different NameNodes could have overlapping ranges 
> for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges 
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key which 
> will cause clients to fail because of {{InvalidToken}} error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13270) RBF: Router audit logger

2019-09-30 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941215#comment-16941215
 ] 

hemanthboyina commented on HDFS-13270:
--

In the Namenode's implementation, most of the methods return a file status; 
based on that file status object we get the owner and permission values and 
form the audit log accordingly:
{code:java}
auditStat = FSDirMkdirOp.mkdirs(this, pc, src, permissions,
  ...
logAuditEvent(true, operationName, src, null, auditStat);
{code}
That's not the case with Routers:
{code:java}
public boolean mkdirs(
  ...
return rpcClient.invokeSingle(firstLocation, method, Boolean.class);
{code}

Welcoming suggestions on how to implement this, thanks.
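
One possible direction, purely as a sketch with hypothetical helper names, is 
to audit-log from the Router's own RPC context even though no FileStatus comes 
back:
{code:java}
// Hypothetical sketch: log what the Router does know (caller, operation,
// path, outcome) instead of relying on a FileStatus return value.
class RouterAuditSketch {
  boolean mkdirs(String src) throws java.io.IOException {
    boolean success = invokeSingleOnNamenode(src); // existing RPC path
    logAuditEvent(success, "mkdirs", src);
    return success;
  }

  private boolean invokeSingleOnNamenode(String src) { return true; } // stub
  private void logAuditEvent(boolean ok, String op, String src) {
    System.out.printf("allowed=%s\tcmd=%s\tsrc=%s%n", ok, op, src);
  }
}
{code}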

> RBF: Router audit logger
> 
>
> Key: HDFS-13270
> URL: https://issues.apache.org/jira/browse/HDFS-13270
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Affects Versions: 3.2.0
>Reporter: maobaolong
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-13270.001.patch, HDFS-13270.002.patch, 
> HDFS-13270.003.patch
>
>
> We can use router auditlogger to log the client info and cmd, because the 
> FSNamesystem#Auditlogger's log think the client are all from router.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14498) LeaseManager can loop forever on the file for which create has failed

2019-09-30 Thread leigh (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941211#comment-16941211
 ] 

leigh commented on HDFS-14498:
--

Hi,

We also encountered this issue today:

Hadoop 3.2.0

Source code repository https://github.com/apache/hadoop.git -r 
e97acb3bd8f3befd27418996fa5d4b50bf2e17bf

Compiled by sunilg on 2019-01-08T06:08Z

Compiled with protoc 2.5.0

From source with checksum d3f0795ed0d9dc378e2c785d3668f39

 

We did have an issue in our cluster before this happened where some of our DNs 
stopped receiving heartbeats from the active NN (although heartbeats from the 
standby NNs could be seen).

I ran the recoverLease command on the bad files but that did not help.

I restarted the NNs and all the DNs. That stopped the spamming of the logs, 
but we were still unable to write to the file. In the end I had to delete the 
bad files.

We have a sizeable cluster. Is there anything in particular you would like to 
see from the logs?

 

Thanks in advance.

> LeaseManager can loop forever on the file for which create has failed 
> --
>
> Key: HDFS-14498
> URL: https://issues.apache.org/jira/browse/HDFS-14498
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Sergey Shelukhin
>Priority: Major
>
> The logs from file creation are long gone due to infinite lease logging, 
> however it presumably failed... the client who was trying to write this file 
> is definitely long dead.
> The version includes HDFS-4882.
> We get this log pattern repeating infinitely:
> {noformat}
> 2019-05-16 14:00:16,893 INFO 
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1] has expired hard 
> limit
> 2019-05-16 14:00:16,893 INFO 
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease.  
> Holder: DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1], src=
> 2019-05-16 14:00:16,893 WARN 
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] 
> org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.internalReleaseLease: 
> Failed to release lease for file . Committed blocks are waiting to be 
> minimally replicated. Try again later.
> 2019-05-16 14:00:16,893 WARN 
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Cannot release the path 
>  in the lease [Lease.  Holder: DFSClient_NONMAPREDUCE_-20898906_61, 
> pending creates: 1]. It will be retried.
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR* 
> NameSystem.internalReleaseLease: Failed to release lease for file . 
> Committed blocks are waiting to be minimally replicated. Try again later.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3357)
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:573)
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:509)
>   at java.lang.Thread.run(Thread.java:745)
> $  grep -c "Recovering.*DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 
> 1" hdfs_nn*
> hdfs_nn.log:1068035
> hdfs_nn.log.2019-05-16-14:1516179
> hdfs_nn.log.2019-05-16-15:1538350
> {noformat}
> Aside from an actual bug fix, it might make sense to make LeaseManager not 
> log so much, in case there are more bugs like this...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14495) RBF: Duplicate FederationRPCMetrics

2019-09-30 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941198#comment-16941198
 ] 

hemanthboyina commented on HDFS-14495:
--

The interfaces RouterRpcMonitor and FederationRPCMBean have mostly the same 
metrics.

Can we remove the dependency and duplication by removing either one of them?

> RBF: Duplicate FederationRPCMetrics
> ---
>
> Key: HDFS-14495
> URL: https://issues.apache.org/jira/browse/HDFS-14495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: metrics
>Reporter: Akira Ajisaka
>Assignee: hemanthboyina
>Priority: Major
>
> There are two FederationRPCMetrics displayed in the Web UI (http://<hostname>:<port>/jmx), and most of the metrics are the same.
> * FederationRPCMetrics via {{@Metrics}} and {{@Metric}} annotations
> * FederationRPCMetrics via registering FederationRPCMBean
> Can we remove {{@Metrics}} and {{@Metric}} annotations to remove duplication?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2187) ozone-mr test fails with No FileSystem for scheme "o3fs"

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2187?focusedWorklogId=320715=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320715
 ]

ASF GitHub Bot logged work on HDDS-2187:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:58
Start Date: 30/Sep/19 17:58
Worklog Time Spent: 10m 
  Work Description: elek commented on issue #1537: HDDS-2187. ozone-mr test 
fails with No FileSystem for scheme o3fs
URL: https://github.com/apache/hadoop/pull/1537#issuecomment-536677464
 
 
   Thanks for the update @adoroszlai. Looks good. Good to have `ozone fs` fixed.
   
   One problem:
   datanodes in ozonesecure-mr are not started AFAIK
   
   
https://github.com/elek/ozone-ci-q4/blob/master/pr/pr-hdds-2187-2nl4x/acceptance/output.log
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320715)
Time Spent: 1h 20m  (was: 1h 10m)

> ozone-mr test fails with No FileSystem for scheme "o3fs"
> 
>
> Key: HDDS-2187
> URL: https://issues.apache.org/jira/browse/HDDS-2187
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> HDDS-2101 changed how Ozone filesystem provider is configured.  {{ozone-mr}} 
> tests [started 
> failing|https://github.com/elek/ozone-ci/blob/2f2c99652af6b26a95f08eece9e545f0d72ccf45/pr/pr-hdds-2101-rtz55/acceptance/output.log#L255-L263],
>  but it [wasn't 
> noticed|https://github.com/elek/ozone-ci/blob/master/pr/pr-hdds-2101-rtz55/acceptance/result]
>  due to HDDS-2185.
> {code}
> Running command 'ozone fs -mkdir /user'
> ${output} = mkdir: No FileSystem for scheme "o3fs"
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception

2019-09-30 Thread Shashikant Banerjee (Jira)
Shashikant Banerjee created HDDS-2210:
-

 Summary: ContainerStateMachine should not be marked unhealthy if 
applyTransaction fails with closed container exception
 Key: HDDS-2210
 URL: https://issues.apache.org/jira/browse/HDDS-2210
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Datanode
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
 Fix For: 0.5.0


Currently, if applyTransaction fails, the stateMachine is marked unhealthy and 
the next snapshot creation will fail. As a result, the raftServer will close 
down, leading to pipeline failure. A ClosedContainer exception should not 
cause the stateMachine to be marked unhealthy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320708=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320708
 ]

ASF GitHub Bot logged work on HDDS-1720:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:52
Start Date: 30/Sep/19 17:52
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on issue #1538: HDDS-1720 : Add 
ability to configure RocksDB logs for Ozone Manager.
URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536674819
 
 
   /retest
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320708)
Time Spent: 40m  (was: 0.5h)

> Add ability to configure RocksDB logs for Ozone Manager
> ---
>
> Key: HDDS-1720
> URL: https://issues.apache.org/jira/browse/HDDS-1720
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> While doing performance testing, it was seen that there was no way to get 
> RocksDB logs for Ozone Manager. Along with RocksDB metrics, this may be a 
> useful mechanism to understand the health of RocksDB while investigating 
> large clusters. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14495) RBF: Duplicate FederationRPCMetrics

2019-09-30 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941186#comment-16941186
 ] 

hemanthboyina commented on HDFS-14495:
--

Hi [~elgoiri], any suggestions for this?

> RBF: Duplicate FederationRPCMetrics
> ---
>
> Key: HDFS-14495
> URL: https://issues.apache.org/jira/browse/HDFS-14495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: metrics
>Reporter: Akira Ajisaka
>Assignee: hemanthboyina
>Priority: Major
>
> There are two FederationRPCMetrics displayed in the Web UI (http://<hostname>:<port>/jmx), and most of the metrics are the same.
> * FederationRPCMetrics via {{@Metrics}} and {{@Metric}} annotations
> * FederationRPCMetrics via registering FederationRPCMBean
> Can we remove {{@Metrics}} and {{@Metric}} annotations to remove duplication?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis

2019-09-30 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941170#comment-16941170
 ] 

Hadoop QA commented on HDDS-2169:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 37m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
35s{color} | {color:red} hadoop-hdds in trunk failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
38s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdds in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
13s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdds in trunk failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
17s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 17m  
8s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-hdds in trunk failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
17s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
32s{color} | {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
34s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
15s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 22s{color} 
| {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 15s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} hadoop-hdds: The patch generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
16s{color} | 

[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=320700=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320700
 ]

ASF GitHub Bot logged work on HDDS-2169:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:37
Start Date: 30/Sep/19 17:37
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1517: HDDS-2169
URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536668387
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 2244 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | -1 | test4tests | 0 | The patch doesn't appear to include any new or 
modified tests.  Please justify why no new tests are needed for this patch. 
Also please list what manual steps were performed to verify this patch. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 36 | Maven dependency ordering for branch |
   | -1 | mvninstall | 35 | hadoop-hdds in trunk failed. |
   | -1 | mvninstall | 38 | hadoop-ozone in trunk failed. |
   | -1 | compile | 19 | hadoop-hdds in trunk failed. |
   | -1 | compile | 13 | hadoop-ozone in trunk failed. |
   | +1 | checkstyle | 59 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 940 | branch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 19 | hadoop-hdds in trunk failed. |
   | -1 | javadoc | 17 | hadoop-ozone in trunk failed. |
   | 0 | spotbugs | 1028 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | -1 | findbugs | 30 | hadoop-hdds in trunk failed. |
   | -1 | findbugs | 17 | hadoop-ozone in trunk failed. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 15 | Maven dependency ordering for patch |
   | -1 | mvninstall | 32 | hadoop-hdds in the patch failed. |
   | -1 | mvninstall | 34 | hadoop-ozone in the patch failed. |
   | -1 | compile | 22 | hadoop-hdds in the patch failed. |
   | -1 | compile | 15 | hadoop-ozone in the patch failed. |
   | -1 | javac | 22 | hadoop-hdds in the patch failed. |
   | -1 | javac | 15 | hadoop-ozone in the patch failed. |
   | -0 | checkstyle | 25 | hadoop-hdds: The patch generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 789 | patch has no errors when building and testing 
our client artifacts. |
   | -1 | javadoc | 19 | hadoop-hdds in the patch failed. |
   | -1 | javadoc | 16 | hadoop-ozone in the patch failed. |
   | -1 | findbugs | 29 | hadoop-hdds in the patch failed. |
   | -1 | findbugs | 17 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | -1 | unit | 25 | hadoop-hdds in the patch failed. |
   | -1 | unit | 22 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 30 | The patch does not generate ASF License warnings. |
   | | | 4686 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.0 Server=19.03.0 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/Dockerfile
 |
   | JIRA Issue | HDDS-2169 |
   | JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12981115/o2169_20190923.patch |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 11b7ced0c5c4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 98ca07e |
   | Default Java | 1.8.0_222 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-mvninstall-hadoop-ozone.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-compile-hadoop-hdds.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-compile-hadoop-ozone.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-javadoc-hadoop-hdds.txt
 |
   | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-javadoc-hadoop-ozone.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-findbugs-hadoop-hdds.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-findbugs-hadoop-ozone.txt
 |
   | 

[jira] [Assigned] (HDDS-1984) Fix listBucket API

2019-09-30 Thread YiSheng Lien (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YiSheng Lien reassigned HDDS-1984:
--

Assignee: YiSheng Lien

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and the double-buffer thread later picks it up and 
> flushes it to disk. So, when listBuckets is called, it should use both the 
> in-memory cache and the RocksDB bucket table to list the buckets in a volume.
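For illustration, here is a hypothetical Java sketch of that merge, using plain maps in place of the real OM cache and RocksDB table types (all names here are invented for the example):

{code:java}
import java.util.Map;
import java.util.TreeMap;

public final class ListBucketsSketch {

  /**
   * Lists buckets by overlaying the pending (not yet flushed) cache
   * entries on top of the RocksDB-backed bucket table.
   */
  static Map<String, String> listBuckets(
      Map<String, String> bucketTable,  // flushed state, from RocksDB
      Map<String, String> cache) {      // pending state, from the cache
    // Start from what has already been flushed, sorted by bucket name.
    Map<String, String> merged = new TreeMap<>(bucketTable);
    // Cache entries win because they are newer than the flushed state;
    // a null value marks a delete that has not reached the table yet.
    cache.forEach((name, info) -> {
      if (info == null) {
        merged.remove(name);
      } else {
        merged.put(name, info);
      }
    });
    return merged;
  }
}
{code}

The real implementation would presumably also need to handle volume prefixes, pagination, and iterator ordering, but the cache-over-table overlay is the core of the fix.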



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-1984) Fix listBucket API

2019-09-30 Thread YiSheng Lien (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YiSheng Lien reassigned HDDS-1984:
--

Assignee: (was: YiSheng Lien)

> Fix listBucket API
> --
>
> Key: HDDS-1984
> URL: https://issues.apache.org/jira/browse/HDDS-1984
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Priority: Major
>
> This Jira is to fix the listBucket API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and 
> return the response, and the double-buffer thread later picks it up and 
> flushes it to disk. So, when listBuckets is called, it should use both the 
> in-memory cache and the RocksDB bucket table to list the buckets in a volume.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


