[jira] [Commented] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569042#comment-16569042
 ] 

genericqa commented on HDDS-298:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
36s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  1m 24s{color} 
| {color:red} server-scm in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdds.scm.container.TestContainerMapping |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-298 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934372/HDDS-298.04.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 176fdd47217c 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bcfc985 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDDS-Build/692/artifact/out/branch-findbugs-hadoop-hdds_server-scm-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDDS-Build/692/artifact/out/patch-unit-hadoop-hdds_server-scm.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDDS-Build/692/testReport/ |
| Max. process+thread count | 410 (vs. ulimit of 1) |
| modules | C: hadoop-hdds/server-scm U: hadoop-hdds/server-scm |
| Console ou

[jira] [Commented] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers

2018-08-03 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569022#comment-16569022
 ] 

Ajay Kumar commented on HDDS-298:
-

[~nandakumar131], thanks for the review. Patch v4 throws an SCM IO exception 
in place of the Precondition check.
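A minimal sketch of the kind of change described above: replacing a Guava-style Precondition (which fails with an unchecked exception) with a checked IO exception that the RPC layer can relay to clients. All class and method names here are simplified stand-ins, not the actual HDDS-298 code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Stand-in for a server-side lookup: a missing container now surfaces as a
// checked IOException instead of an IllegalStateException from a
// Preconditions.checkState(...) call.
class ContainerLookup {
    private final Map<Long, String> containers = new HashMap<>();

    void addContainer(long id, String pipelineInfo) {
        containers.put(id, pipelineInfo);
    }

    String getContainerWithPipeline(long containerId) throws IOException {
        String info = containers.get(containerId);
        if (info == null) {
            // Previously something like:
            //   Preconditions.checkState(info != null, "...");
            throw new IOException(
                "Specified container " + containerId + " does not exist");
        }
        return info;
    }
}
```

The practical difference is that a checked exception is part of the method contract, so callers (and the RPC translator) are forced to handle the missing-container case explicitly.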

> Implement SCMClientProtocolServer.getContainerWithPipeline for closed 
> containers
> 
>
> Key: HDDS-298
> URL: https://issues.apache.org/jira/browse/HDDS-298
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Ajay Kumar
>Priority: Critical
> Fix For: 0.2.1
>
> Attachments: HDDS-298.00.patch, HDDS-298.01.patch, HDDS-298.02.patch, 
> HDDS-298.03.patch, HDDS-298.04.patch
>
>
> As [~ljain] mentioned during the review of HDDS-245, 
> SCMClientProtocolServer.getContainerWithPipeline doesn't return valid 
> data for closed containers. For closed containers we maintain the 
> datanodes for a containerId in ContainerStateMap.contReplicaMap. We need 
> to create a fake Pipeline object on request and return it so the client can 
> locate the right datanodes to download the data.
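The approach described in the issue can be sketched as follows: for a closed container the original pipeline no longer exists, so a pipeline-like object is assembled on request from the replica map. The types below are simplified, hypothetical stand-ins (a plain list of datanode names), not the actual HDDS Pipeline or ContainerStateMap classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Build an ad-hoc "pipeline" for a closed container from the
// containerId -> datanodes replica map maintained by SCM.
class FakePipelineFactory {
    static List<String> createPipelineFor(long containerId,
            Map<Long, Set<String>> contReplicaMap) {
        Set<String> replicas = contReplicaMap.get(containerId);
        if (replicas == null || replicas.isEmpty()) {
            throw new NoSuchElementException(
                "No replicas known for container " + containerId);
        }
        // The "pipeline" here is simply the set of datanodes the client
        // can contact to read the closed container's data.
        return new ArrayList<>(replicas);
    }
}
```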



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers

2018-08-03 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-298:

Attachment: HDDS-298.04.patch

> Implement SCMClientProtocolServer.getContainerWithPipeline for closed 
> containers
> 
>
> Key: HDDS-298
> URL: https://issues.apache.org/jira/browse/HDDS-298
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Ajay Kumar
>Priority: Critical
> Fix For: 0.2.1
>
> Attachments: HDDS-298.00.patch, HDDS-298.01.patch, HDDS-298.02.patch, 
> HDDS-298.03.patch, HDDS-298.04.patch
>
>
> As [~ljain] mentioned during the review of HDDS-245, 
> SCMClientProtocolServer.getContainerWithPipeline doesn't return valid 
> data for closed containers. For closed containers we maintain the 
> datanodes for a containerId in ContainerStateMap.contReplicaMap. We need 
> to create a fake Pipeline object on request and return it so the client can 
> locate the right datanodes to download the data.






[jira] [Updated] (HDFS-11610) sun.net.spi.nameservice.NameService has moved to a new location

2018-08-03 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-11610:

Fix Version/s: 3.2.0

> sun.net.spi.nameservice.NameService has moved to a new location
> ---
>
> Key: HDFS-11610
> URL: https://issues.apache.org/jira/browse/HDFS-11610
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-11610.001.patch, HDFS-11610.002.patch
>
>
> sun.net.spi.nameservice.NameService was moved to 
> java.net.InetAddress$NameService in Java 9. TestDFSClientFailover uses this 
> class to spy on the name service.
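One way a test can cope with the relocation described above is to probe for whichever class exists on the running JDK via reflection. This is a hedged sketch of that idea only; it is not the HDFS-11610 patch, and on still-newer JDKs the internal interface may have moved again (hence the null fallback).

```java
// Probe the two historically known locations of the internal NameService
// interface, newest first. Class.forName only loads the class object; it
// does not require the caller to be able to instantiate it.
class NameServiceLocator {
    static Class<?> locate() {
        for (String name : new String[] {
                "java.net.InetAddress$NameService",        // Java 9 era
                "sun.net.spi.nameservice.NameService" }) { // Java 8
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                // Not present on this JDK; try the next known location.
            }
        }
        return null; // interface moved somewhere this sketch doesn't know
    }
}
```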






[jira] [Resolved] (HDDS-60) mvn site:site -Phdds does not work

2018-08-03 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao resolved HDDS-60.

Resolution: Not A Problem

Thanks [~elek] for looking into this.

After "mvn clean install", the ~/.m2/ repository was rebuilt and the issue no 
longer reproduces.

> mvn site:site -Phdds does not work
> --
>
> Key: HDDS-60
> URL: https://issues.apache.org/jira/browse/HDDS-60
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Fix For: 0.2.1
>
>
> {code}[*INFO*] *Reactor Summary:*
> [*INFO*] 
> [*INFO*] Apache Hadoop Main . *FAILURE* [ 
> 47.386 s]
> ...
> [*ERROR*] Failed to execute goal 
> org.apache.maven.plugins:maven-site-plugin:3.6:site *(default-cli)* on 
> project hadoop-main: *failed to get report for 
> org.apache.maven.plugins:maven-javadoc-plugin*: Failed to execute goal on 
> project hadoop-hdds-server-scm: *Could not resolve dependencies for project 
> org.apache.hadoop:hadoop-hdds-server-scm:jar:3.2.0-SNAPSHOT: Could not find 
> artifact 
> org.apache.hadoop:hadoop-hdds-container-service:jar:tests:3.2.0-SNAPSHOT in 
> public (http://nexus-private.hortonworks.com/nexus/content/groups/public)* -> 
> [Help 1]
> {code}






[jira] [Work stopped] (HDDS-60) mvn site:site -Phdds does not work

2018-08-03 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDDS-60 stopped by Xiaoyu Yao.
--
> mvn site:site -Phdds does not work
> --
>
> Key: HDDS-60
> URL: https://issues.apache.org/jira/browse/HDDS-60
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Fix For: 0.2.1
>
>
> {code}[*INFO*] *Reactor Summary:*
> [*INFO*] 
> [*INFO*] Apache Hadoop Main . *FAILURE* [ 
> 47.386 s]
> ...
> [*ERROR*] Failed to execute goal 
> org.apache.maven.plugins:maven-site-plugin:3.6:site *(default-cli)* on 
> project hadoop-main: *failed to get report for 
> org.apache.maven.plugins:maven-javadoc-plugin*: Failed to execute goal on 
> project hadoop-hdds-server-scm: *Could not resolve dependencies for project 
> org.apache.hadoop:hadoop-hdds-server-scm:jar:3.2.0-SNAPSHOT: Could not find 
> artifact 
> org.apache.hadoop:hadoop-hdds-container-service:jar:tests:3.2.0-SNAPSHOT in 
> public (http://nexus-private.hortonworks.com/nexus/content/groups/public)* -> 
> [Help 1]
> {code}






[jira] [Work started] (HDDS-60) mvn site:site -Phdds does not work

2018-08-03 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDDS-60 started by Xiaoyu Yao.
--
> mvn site:site -Phdds does not work
> --
>
> Key: HDDS-60
> URL: https://issues.apache.org/jira/browse/HDDS-60
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Fix For: 0.2.1
>
>
> {code}[*INFO*] *Reactor Summary:*
> [*INFO*] 
> [*INFO*] Apache Hadoop Main . *FAILURE* [ 
> 47.386 s]
> ...
> [*ERROR*] Failed to execute goal 
> org.apache.maven.plugins:maven-site-plugin:3.6:site *(default-cli)* on 
> project hadoop-main: *failed to get report for 
> org.apache.maven.plugins:maven-javadoc-plugin*: Failed to execute goal on 
> project hadoop-hdds-server-scm: *Could not resolve dependencies for project 
> org.apache.hadoop:hadoop-hdds-server-scm:jar:3.2.0-SNAPSHOT: Could not find 
> artifact 
> org.apache.hadoop:hadoop-hdds-container-service:jar:tests:3.2.0-SNAPSHOT in 
> public (http://nexus-private.hortonworks.com/nexus/content/groups/public)* -> 
> [Help 1]
> {code}






[jira] [Commented] (HDDS-267) Handle consistency issues during container update/close

2018-08-03 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568883#comment-16568883
 ] 

Bharat Viswanadham commented on HDDS-267:
-

Hi [~hanishakoneru]

The patch does not apply on top of trunk. Sorry, I missed these changes 
during review. It needs a rebase.

> Handle consistency issues during container update/close
> ---
>
> Key: HDDS-267
> URL: https://issues.apache.org/jira/browse/HDDS-267
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-267.001.patch, HDDS-267.002.patch, 
> HDDS-267.003.patch
>
>
> During container update and close, the .container file on disk is modified. 
> We should make sure that the in-memory state and the on-disk state for a 
> container are consistent. 
> A write lock is obtained before updating the container data during close or 
> update operations.
> During the update operation, if the on-disk update of the .container file 
> fails, then the in-memory container metadata is also reset to the old value.
> During the close operation, if the on-disk update of the .container file 
> fails, then the in-memory containerState is set to CLOSING so that no new 
> operations are permitted.
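The update-path protocol described above (write lock, persist, roll back in-memory state if the disk write fails) can be sketched as below. The disk write is modeled as a callback so the failure path can be exercised; all names are illustrative stand-ins, not the HDDS-267 code.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// In-memory state plus a pluggable on-disk writer. The update takes the
// write lock, applies the in-memory change, then persists; if persisting
// the ".container file" fails, the in-memory value is rolled back so the
// two copies stay consistent.
class ContainerData {
    interface DiskWriter { void persist(String state) throws IOException; }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private String state = "OPEN";

    String getState() { return state; }

    void update(String newState, DiskWriter disk) throws IOException {
        lock.writeLock().lock();
        try {
            String old = state;
            state = newState;           // update in-memory copy
            try {
                disk.persist(newState); // then the on-disk copy
            } catch (IOException e) {
                state = old;            // disk failed: roll back in-memory
                throw e;
            }
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Holding the write lock across both steps is what prevents a reader from observing the new in-memory value while the disk write is still in flight (or about to be rolled back).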






[jira] [Assigned] (HDDS-323) Rename Storage Containers

2018-08-03 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned HDDS-323:
--

Assignee: Arpit Agarwal

> Rename Storage Containers
> -
>
> Key: HDDS-323
> URL: https://issues.apache.org/jira/browse/HDDS-323
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
> Fix For: 0.2.1
>
>
> The term container is heavily overloaded and easy to confuse with yarn/Linux 
> containers.
> I propose renaming _*containers*_ to _*bins*_. Am very much open to better 
> suggestions though.
> This also means that SCM (Storage Container Manager) gets renamed to SBM 
> (Storage Bin Manager).






[jira] [Updated] (HDDS-323) Rename Storage Containers

2018-08-03 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-323:
---
Description: 
The term container is heavily overloaded and easy to confuse with yarn/Linux 
containers.

I propose renaming _*containers*_ to _*bins*_. Am very much open to better 
suggestions though.

This also means that SCM (Storage Container Manager) gets renamed to SBM 
(Storage Bin Manager).

  was:
The term container is heavily overloaded and easy to confuse with yarn/Linux 
containers.

 


> Rename Storage Containers
> -
>
> Key: HDDS-323
> URL: https://issues.apache.org/jira/browse/HDDS-323
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Priority: Major
> Fix For: 0.2.1
>
>
> The term container is heavily overloaded and easy to confuse with yarn/Linux 
> containers.
> I propose renaming _*containers*_ to _*bins*_. Am very much open to better 
> suggestions though.
> This also means that SCM (Storage Container Manager) gets renamed to SBM 
> (Storage Bin Manager).






[jira] [Commented] (HDDS-323) Rename Storage Containers

2018-08-03 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568880#comment-16568880
 ] 

Arpit Agarwal commented on HDDS-323:


We should do this once the SCM refactoring work is completed and ideally before 
the first Ozone Alpha release.

> Rename Storage Containers
> -
>
> Key: HDDS-323
> URL: https://issues.apache.org/jira/browse/HDDS-323
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Priority: Major
> Fix For: 0.2.1
>
>
> The term container is heavily overloaded and easy to confuse with yarn/Linux 
> containers.
> I propose renaming _*containers*_ to _*bins*_. Am very much open to better 
> suggestions though.
> This also means that SCM (Storage Container Manager) gets renamed to SBM 
> (Storage Bin Manager).






[jira] [Updated] (HDDS-323) Rename Storage Containers

2018-08-03 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-323:
---
Fix Version/s: 0.2.1

> Rename Storage Containers
> -
>
> Key: HDDS-323
> URL: https://issues.apache.org/jira/browse/HDDS-323
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Priority: Major
> Fix For: 0.2.1
>
>
> The term container is heavily overloaded and easy to confuse with yarn/Linux 
> containers.
>  






[jira] [Commented] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568879#comment-16568879
 ] 

Plamen Jeliazkov commented on HDFS-13767:
-

I like the mixed strategy / optimization. Another reason for throwing the 
exception could be that the ObserverNode already has too many requests queued 
up. Imagine tailing slowed down for some reason, or the node's processing 
slowed; an unlikely scenario, but it would be a way to put in some easy back 
pressure.
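The back-pressure idea floated above can be sketched as a bounded queue that rejects new work once a cap is hit, pushing clients to retry against a less-loaded node. This is purely illustrative (in a real server the rejection would be a retriable RPC exception, and the cap would be configurable); it is not the HDFS-13767 patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Reject-when-full queue: a crude but effective form of back pressure for
// an Observer whose edit tailing or processing has fallen behind.
class BoundedRequestQueue {
    private final Deque<Long> pending = new ArrayDeque<>();
    private final int maxQueued;

    BoundedRequestQueue(int maxQueued) { this.maxQueued = maxQueued; }

    void enqueue(long requestTxId) {
        if (pending.size() >= maxQueued) {
            // Stand-in for a retriable server-side exception.
            throw new IllegalStateException("Observer overloaded: "
                + pending.size() + " requests already queued");
        }
        pending.add(requestTxId);
    }

    int size() { return pending.size(); }
}
```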

> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a followup on HDFS-13688, where the msync API is introduced to 
> {{ClientProtocol}} but the server side implementation is missing. This 
> Jira is to implement the server side logic.






[jira] [Resolved] (HDDS-286) Fix NodeReportPublisher.getReport NPE

2018-08-03 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao resolved HDDS-286.
-
Resolution: Cannot Reproduce

> Fix NodeReportPublisher.getReport NPE
> -
>
> Key: HDDS-286
> URL: https://issues.apache.org/jira/browse/HDDS-286
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Junjie Chen
>Priority: Major
>  Labels: newbie
> Fix For: 0.2.1
>
>
> This can be reproed with TestKeys#testPutKey
> {code}
> 2018-07-23 21:33:55,598 WARN  concurrent.ExecutorHelper 
> (ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in 
> thread Datanode ReportManager Thread - 0: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:107)
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:350)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:260)
>   at 
> org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:64)
>   at 
> org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:39)
>   at 
> org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
>   at 
> org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
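The trace above shows getScmUsed dereferencing state that was presumably already torn down by a concurrent shutdown (the issue was later closed as not reproducible on trunk). A defensive pattern for that race is sketched below; the class is a hypothetical stand-in, not the actual Ozone VolumeInfo.

```java
// Usage tracker whose backing value can be nulled by a concurrent
// shutdown. Reading the volatile field once into a local, then
// null-checking, avoids the NPE even if shutdown() runs in between.
class VolumeInfo {
    private volatile Long scmUsed; // null once the volume is shut down

    VolumeInfo(long initialUsed) { this.scmUsed = initialUsed; }

    long getScmUsed() {
        Long used = scmUsed; // single read of the racy field
        return used == null ? 0L : used;
    }

    void shutdown() { scmUsed = null; }
}
```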






[jira] [Comment Edited] (HDDS-286) Fix NodeReportPublisher.getReport NPE

2018-08-03 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568877#comment-16568877
 ] 

Xiaoyu Yao edited comment on HDDS-286 at 8/3/18 10:37 PM:
--

Thanks [~junjie] for looking into this. It does not reproduce on the latest 
trunk any more. Closing as cannot reproduce.


was (Author: xyao):
It does not repro on latest trunk any more. Close as not repro

> Fix NodeReportPublisher.getReport NPE
> -
>
> Key: HDDS-286
> URL: https://issues.apache.org/jira/browse/HDDS-286
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Junjie Chen
>Priority: Major
>  Labels: newbie
> Fix For: 0.2.1
>
>
> This can be reproed with TestKeys#testPutKey
> {code}
> 2018-07-23 21:33:55,598 WARN  concurrent.ExecutorHelper 
> (ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in 
> thread Datanode ReportManager Thread - 0: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:107)
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:350)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:260)
>   at 
> org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:64)
>   at 
> org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:39)
>   at 
> org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
>   at 
> org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Commented] (HDDS-286) Fix NodeReportPublisher.getReport NPE

2018-08-03 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568877#comment-16568877
 ] 

Xiaoyu Yao commented on HDDS-286:
-

It does not reproduce on the latest trunk any more. Closing as cannot 
reproduce.

> Fix NodeReportPublisher.getReport NPE
> -
>
> Key: HDDS-286
> URL: https://issues.apache.org/jira/browse/HDDS-286
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Junjie Chen
>Priority: Major
>  Labels: newbie
> Fix For: 0.2.1
>
>
> This can be reproed with TestKeys#testPutKey
> {code}
> 2018-07-23 21:33:55,598 WARN  concurrent.ExecutorHelper 
> (ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in 
> thread Datanode ReportManager Thread - 0: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:107)
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:350)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:260)
>   at 
> org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:64)
>   at 
> org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:39)
>   at 
> org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
>   at 
> org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Commented] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568875#comment-16568875
 ] 

genericqa commented on HDFS-13792:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
38m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 25s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13792 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934332/HDFS-13792.000.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 8e1278cbe714 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2b18bb4 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 305 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24698/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NanosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}
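For concreteness, with a hypothetical operation name of {{GetBlockLocations}} (illustrative only; actual operation names come from the FSN lock instrumentation), the corrected and incorrect metric names would look like:

```
FSNReadLockGetBlockLocationsNanosNumOps    correct (includes "Nanos")
FSNReadLockGetBlockLocationsNumOps         incorrect
```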



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12976) Introduce ObserverReadProxyProvider

2018-08-03 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568861#comment-16568861
 ] 

Erik Krogen commented on HDFS-12976:


Hey [~csun], one follow-up question about these two lines:
{code}
alignmentContext = new ClientGSIContext();
((ClientHAProxyFactory) factory).setAlignmentContext(alignmentContext);
{code}
Is there any danger of overwriting another proxy's AlignmentContext with these 
lines? Will the same factory be re-used across multiple clients? I am wondering 
if it would be more appropriate to check if the factory already has an 
AlignmentContext, and if so re-use that one instead of creating a new one.
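A minimal sketch of that check, with stand-in classes. The real {{ClientHAProxyFactory}} may expose different accessors; {{getAlignmentContext}} here is an assumption, not the actual API:

```java
// Stand-ins for the real HDFS types, for illustration only.
class ClientGSIContext { }

class ClientHAProxyFactory {
    private ClientGSIContext alignmentContext;
    // Assumed accessor; the real factory may not expose this.
    ClientGSIContext getAlignmentContext() { return alignmentContext; }
    void setAlignmentContext(ClientGSIContext ctx) { alignmentContext = ctx; }
}

public class AlignmentContextReuse {
    // Reuse the factory's existing context if one is already set, so a
    // factory shared across proxies never has its context overwritten.
    static ClientGSIContext ensureContext(ClientHAProxyFactory factory) {
        ClientGSIContext existing = factory.getAlignmentContext();
        if (existing != null) {
            return existing;
        }
        ClientGSIContext fresh = new ClientGSIContext();
        factory.setAlignmentContext(fresh);
        return fresh;
    }

    public static void main(String[] args) {
        ClientHAProxyFactory factory = new ClientHAProxyFactory();
        ClientGSIContext first = ensureContext(factory);
        ClientGSIContext second = ensureContext(factory);
        System.out.println("same context reused: " + (first == second));  // -> true
    }
}
```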

> Introduce ObserverReadProxyProvider
> ---
>
> Key: HDFS-12976
> URL: https://issues.apache.org/jira/browse/HDFS-12976
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
> Fix For: HDFS-12943
>
> Attachments: HDFS-12976-HDFS-12943.000.patch, 
> HDFS-12976-HDFS-12943.001.patch, HDFS-12976-HDFS-12943.002.patch, 
> HDFS-12976-HDFS-12943.003.patch, HDFS-12976-HDFS-12943.004.patch, 
> HDFS-12976-HDFS-12943.005.patch, HDFS-12976-HDFS-12943.006.patch, 
> HDFS-12976-HDFS-12943.007.patch, HDFS-12976-HDFS-12943.008.patch, 
> HDFS-12976.WIP.patch
>
>
> {{StandbyReadProxyProvider}} should implement {{FailoverProxyProvider}} 
> interface and be able to submit read requests to ANN and SBN(s).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568860#comment-16568860
 ] 

Chen Liang commented on HDFS-13767:
---

Thanks [~zero45]. I updated the previous comment, though I probably should have 
just posted new comments. I did not realize you would be looking at it so quickly :).

As I mentioned in the comments above, we did have some offline discussion on 
this: the client could simply abandon an observer if it finds it too far behind 
and try another one. We can file a JIRA and follow up on that.
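The client-side fallback policy described above could be sketched as follows. This is hypothetical, not the actual HDFS-12943 implementation; the staleness threshold and the txid bookkeeping are assumptions for illustration:

```java
import java.util.Arrays;
import java.util.List;

public class ObserverSelection {
    // Assumed staleness threshold: an observer lagging the client's
    // last-seen transaction id by more than this is abandoned.
    static final long MAX_LAG = 100;

    // Each element is an observer's last applied txid; clientTxId is the
    // state id the client has already observed.
    static int pickObserver(List<Long> observerTxIds, long clientTxId) {
        for (int i = 0; i < observerTxIds.size(); i++) {
            if (clientTxId - observerTxIds.get(i) <= MAX_LAG) {
                return i;  // first observer that is close enough
            }
        }
        return -1;  // none usable; caller would fail over to the active NN
    }

    public static void main(String[] args) {
        List<Long> observers = Arrays.asList(500L, 940L, 990L);
        // Observer 0 lags by 500 (skipped); observer 1 lags by only 60.
        int chosen = pickObserver(observers, 1000L);
        System.out.println("chosen observer index: " + chosen);  // -> 1
    }
}
```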

> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a followup on HDFS-13688, where the msync API is introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This 
> Jira is to implement the server-side logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568843#comment-16568843
 ] 

Chen Liang edited comment on HDFS-13767 at 8/3/18 10:21 PM:


Thanks for the comments, [~zero45].

bq. (2) A concern about the unit test...
Thanks for trying this. I think it might be related to the recent change that 
enabled in-progress edit tailing: if I disable in-progress edit tailing for 
this test and then add the delay you mentioned, the test succeeds, but takes 
longer (expected). I will need to look deeper at how in-progress edit tailing 
interacts with this. [~xkrogen], any quick insights?

bq. (3) So far these patches don't actually ...
The only reason msync appears in {{NameNodeRpcServer}} is that msync is part of 
{{ClientProtocol}}, which NameNodeRpcServer implements, so removing it from 
NameNodeRpcServer would not compile. I guess msync is a little different from 
other RPC calls: from the client side it is an RPC call like any other, but on 
the server side all of its logic is handled in the RPC layer, not in the NN 
layer at all.

bq. (4) I was curious about making msync pause ...
Just some thoughts of mine: it is true that there would be no deferred calls in 
this case, but it seems to me this would also add retry msync calls, so I don't 
see a particular advantage in terms of call-queue load. It seems essentially 
the same as deferring the call; the only difference is whether the client sends 
another call, or the server treats the existing call as a new one (by putting 
it back on the queue).

Actually, maybe we can have a mixed strategy. We did have some offline 
discussion that if the server sees itself too far behind when checking the 
call, it should not re-queue but just throw an exception, so that the client 
does not bother waiting and can try a different Observer. I view this as an 
optimization.
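The mixed strategy above could be sketched as a three-way decision on the server: serve the call when caught up, defer it while only slightly behind, and reject it once too far behind so the client can try another Observer. All names and the lag bound here are illustrative assumptions, not the actual HDFS code:

```java
public class MsyncGate {
    // Assumed bound on how far behind the server may be and still
    // defer (re-queue) the call instead of rejecting it.
    static final long REQUEUE_LAG = 1_000;

    enum Action { PROCESS, REQUEUE, REJECT }

    // serverTxId: last transaction applied on this Observer.
    // clientSeenTxId: state id the client has already observed elsewhere.
    static Action decide(long serverTxId, long clientSeenTxId) {
        long lag = clientSeenTxId - serverTxId;
        if (lag <= 0) {
            return Action.PROCESS;  // caught up: serve the call now
        } else if (lag <= REQUEUE_LAG) {
            return Action.REQUEUE;  // slightly behind: defer in the call queue
        } else {
            return Action.REJECT;   // too far behind: throw so the client retries elsewhere
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(1_000, 900));    // -> PROCESS
        System.out.println(decide(1_000, 1_500));  // -> REQUEUE
        System.out.println(decide(1_000, 5_000));  // -> REJECT
    }
}
```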


was (Author: vagarychen):
Thanks for the comments, [~zero45].

bq. (2) A concern about the unit test...
Thanks for trying this. I think it might be related to the recent change that 
enabled in-progress edit tailing: if I disable in-progress edit tailing for 
this test and then add the delay you mentioned, the test succeeds, but takes 
longer (expected). I will need to look deeper at how in-progress edit tailing 
interacts with this. [~xkrogen], any quick insights?

bq. (3) So far these patches don't actually ...
The only reason msync appears in {{NameNodeRpcServer}} is that msync is part of 
{{ClientProtocol}}, which NameNodeRpcServer implements, so removing it from 
NameNodeRpcServer would not compile. I guess msync is a little different from 
other RPC calls: from the client side it is an RPC call like any other, but on 
the server side all of its logic is handled in the RPC layer, not in the NN 
layer at all.

> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a followup on HDFS-13688, where the msync API is introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This 
> Jira is to implement the server-side logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568851#comment-16568851
 ] 

Plamen Jeliazkov commented on HDFS-13767:
-

Thanks for the responses, [~vagarychen]!

Thank you, I think I understand (2) and (3) now.

Regarding (4) -- thanks for pointing out that it could be worse; I understand 
where you are coming from. Do you think we could end up hitting that scenario 
anyway in the case of a slow / dying / dead ObserverNode? The client would just 
end up calling _msync()_ on another node, no? I agree that we would just end up 
falling back to a heuristic (configuring some timeout interval). I also suppose 
that if we had enough healthy ObserverNodes, it may not really be a concern to 
have a bunch of calls queued, if we could actually support holding them all 
across hosts versus spontaneous client network traffic trying to find a "good" 
host.



> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a followup on HDFS-13688, where the msync API is introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This 
> Jira is to implement the server-side logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568843#comment-16568843
 ] 

Chen Liang edited comment on HDFS-13767 at 8/3/18 10:12 PM:


Thanks for the comments, [~zero45].

bq. (2) A concern about the unit test...
Thanks for trying this. I think it might be related to the recent change that 
enabled in-progress edit tailing: if I disable in-progress edit tailing for 
this test and then add the delay you mentioned, the test succeeds, but takes 
longer (expected). I will need to look deeper at how in-progress edit tailing 
interacts with this. [~xkrogen], any quick insights?

bq. (3) So far these patches don't actually ...
The only reason msync appears in {{NameNodeRpcServer}} is that msync is part of 
{{ClientProtocol}}, which NameNodeRpcServer implements, so removing it from 
NameNodeRpcServer would not compile. I guess msync is a little different from 
other RPC calls: from the client side it is an RPC call like any other, but on 
the server side all of its logic is handled in the RPC layer, not in the NN 
layer at all.


was (Author: vagarychen):
Thanks for the comments, [~zero45].

bq. (2) A concern about the unit test...
Thanks for trying this. I think it might be related to the recent change that 
enabled in-progress edit tailing: if I disable in-progress edit tailing for 
this test and then add the delay you mentioned, the test succeeds, but takes 
longer (expected). I will need to look deeper at how in-progress edit tailing 
interacts with this. [~xkrogen], any quick insights?

bq. (3) So far these patches don't actually ...
The only reason msync appears in {{NameNodeRpcServer}} is that msync is part of 
{{ClientProtocol}}, which NameNodeRpcServer implements, so removing it from 
NameNodeRpcServer would not compile. I guess msync is a little different from 
other RPC calls: from the client side it is an RPC call like any other, but on 
the server side all of its logic is handled in the RPC layer, not in the NN 
layer at all.

bq. (4) I was curious about making msync pause ...
Just some thoughts of mine: it is true that there would be no deferred calls in 
this case, but it seems to me this would also add retry msync calls, so I don't 
see a particular advantage in terms of call-queue load. Actually, it may be 
even worse: if the client times out and retries, the previous call could still 
be in the server call queue, in which case we could have multiple msync calls 
queued; many clients retrying msync could fill up the queue unnecessarily. And 
if the client retry interval is long, it can easily just waste time.

> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a followup on HDFS-13688, where the msync API is introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This 
> Jira is to implement the server-side logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-230) ContainerStateMachine should provide readStateMachineData api to read data if Containers with required during replication

2018-08-03 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568847#comment-16568847
 ] 

Tsz Wo Nicholas Sze commented on HDDS-230:
--

- In readStateMachineData, we should use CompletableFuture.supplyAsync(..) 
instead of CompletableFuture.completedFuture(..); otherwise, it is not async.
- We should not build the intermediate protos -- just pass the builders.

The code should look like below:
{code}
  private LogEntryProto readStateMachineData(SMLogEntryProto smLogEntryProto,
      ContainerCommandRequestProto requestProto) {
    WriteChunkRequestProto writeChunkRequestProto =
        requestProto.getWriteChunk();
    // Assert that the stored log entry is for COMMIT_DATA; the WRITE_DATA is
    // written through writeStateMachineData.
    Preconditions.checkArgument(writeChunkRequestProto.getStage()
        == Stage.COMMIT_DATA);

    // prepare the chunk to be read
    ReadChunkRequestProto.Builder readChunkRequestProto =
        ReadChunkRequestProto.newBuilder()
            .setBlockID(writeChunkRequestProto.getBlockID())
            .setChunkData(writeChunkRequestProto.getChunkData());
    ContainerCommandRequestProto dataContainerCommandProto =
        ContainerCommandRequestProto.newBuilder(requestProto)
            .setCmdType(Type.ReadChunk)
            .setReadChunk(readChunkRequestProto)
            .build();

    // read the chunk
    ContainerCommandResponseProto response =
        dispatchCommand(dataContainerCommandProto);
    ReadChunkResponseProto responseProto = response.getReadChunk();

    // assert that the response has data in it
    Preconditions.checkNotNull(responseProto.getData());

    // reconstruct the write chunk request with the state machine data added
    final WriteChunkRequestProto.Builder dataWriteChunkProto =
        WriteChunkRequestProto.newBuilder(writeChunkRequestProto)
            .setData(responseProto.getData())
            .setStage(Stage.WRITE_DATA);

    ContainerCommandRequestProto.Builder newStateMachineProto =
        ContainerCommandRequestProto.newBuilder(requestProto)
            .setWriteChunk(dataWriteChunkProto);

    // recreate the log entry
    final SMLogEntryProto log =
        SMLogEntryProto.newBuilder(smLogEntryProto)
            .setStateMachineData(newStateMachineProto.build().toByteString())
            .build();
    return LogEntryProto.newBuilder().setSmLogEntry(log).build();
  }

  /*
   * This api is used by the leader while appending logs to the follower.
   * It allows the leader to read the state machine data from the
   * state machine implementation in case the cached state machine data has
   * been evicted.
   */
  @Override
  public CompletableFuture<LogEntryProto> readStateMachineData(
      LogEntryProto entry) {
    SMLogEntryProto smLogEntryProto = entry.getSmLogEntry();
    if (!smLogEntryProto.getStateMachineData().isEmpty()) {
      return CompletableFuture.completedFuture(entry);
    }

    try {
      final ContainerCommandRequestProto requestProto =
          getRequestProto(entry.getSmLogEntry().getData());
      // readStateMachineData should only be called for "write" to Ratis.
      Preconditions.checkArgument(!HddsUtils.isReadOnly(requestProto));

      if (requestProto.getCmdType() == Type.WriteChunk) {
        return CompletableFuture.supplyAsync(
            () -> readStateMachineData(smLogEntryProto, requestProto),
            writeChunkExecutor);
      } else if (requestProto.getCmdType() == Type.CreateContainer) {
        // recreate the log entry
        final SMLogEntryProto log =
            SMLogEntryProto.newBuilder(smLogEntryProto)
                .setStateMachineData(requestProto.toByteString())
                .build();
        return CompletableFuture.completedFuture(
            LogEntryProto.newBuilder().setSmLogEntry(log).build());
      } else {
        throw new IllegalStateException("Cmd type:" + requestProto.getCmdType()
            + " cannot have state machine data");
      }
    } catch (Exception e) {
      LOG.error("Unable to read stateMachineData", e);
      return completeExceptionally(e);
    }
  }
{code}
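To illustrate the first point about supplyAsync vs. completedFuture: completedFuture evaluates its argument on the calling thread before the future even exists, while supplyAsync defers the work to the executor. A small self-contained demonstration (stand-in work, not the actual chunk read):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncDemo {
    // Stand-in for an expensive chunk read; records which thread ran it.
    static String slowRead() {
        return Thread.currentThread().getName();
    }

    // completedFuture: slowRead() has already run on the caller's thread
    // by the time the future is constructed.
    static String syncThread() {
        return CompletableFuture.completedFuture(slowRead()).join();
    }

    // supplyAsync: slowRead() runs later, on a thread from the executor.
    static String asyncThread(ExecutorService pool) {
        return CompletableFuture.supplyAsync(AsyncDemo::slowRead, pool).join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        System.out.println("completedFuture ran on: " + syncThread());
        System.out.println("supplyAsync ran on:     " + asyncThread(pool));
        pool.shutdown();
    }
}
```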


> ContainerStateMachine should provide readStateMachineData api to read data if 
> Containers with required during replication
> -
>
> Key: HDDS-230
> URL: https://issues.apache.org/jira/browse/HDDS-230
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Critical
> Fix For: 0.2.1
>
> Attachments: HDDS-230.001.patch, HDDS-230.002.patch, 
> HDDS-230.003.patch, HDDS-230.

[jira] [Commented] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568843#comment-16568843
 ] 

Chen Liang commented on HDFS-13767:
---

Thanks for the comments, [~zero45].

bq. (2) A concern about the unit test...
Thanks for trying this. I think it might be related to the recent change that 
enabled in-progress edit tailing: if I disable in-progress edit tailing for 
this test and then add the delay you mentioned, the test succeeds, but takes 
longer (expected). I will need to look deeper at how in-progress edit tailing 
interacts with this. [~xkrogen], any quick insights?

bq. (3) So far these patches don't actually ...
The only reason msync appears in {{NameNodeRpcServer}} is that msync is part of 
{{ClientProtocol}}, which NameNodeRpcServer implements, so removing it from 
NameNodeRpcServer would not compile. I guess msync is a little different from 
other RPC calls: from the client side it is an RPC call like any other, but on 
the server side all of its logic is handled in the RPC layer, not in the NN 
layer at all.

bq. (4) I was curious about making msync pause ...
Just some thoughts of mine: it is true that there would be no deferred calls in 
this case, but it seems to me this would also add retry msync calls, so I don't 
see a particular advantage in terms of call-queue load. Actually, it may be 
even worse: if the client times out and retries, the previous call could still 
be in the server call queue, in which case we could have multiple msync calls 
queued; many clients retrying msync could fill up the queue unnecessarily. And 
if the client retry interval is long, it can easily just waste time.

> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a followup on HDFS-13688, where the msync API is introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This 
> Jira is to implement the server-side logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568829#comment-16568829
 ] 

genericqa commented on HDFS-13792:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
37m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13792 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934328/HDFS-13792.000.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 587e8cc3be4f 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2b18bb4 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 329 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24697/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NanosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13792:

Description: 
The metrics name for FSN read/write lock should be in the format:

{code}
FSN(Read|Write)Lock`*OperationName*`NanosNumOps
{code}

not
{code}
FSN(Read|Write)Lock`*OperationName*`NumOps
{code}


  was:
The metrics name for FSN read/write lock should be in the format:

{code}
FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
{code}

not
{code}
FSN(Read|Write)Lock`*OperationName*`NumOps
{code}



> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NanosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13792:

Attachment: (was: HDFS-13792.000.patch)

> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13792:

Attachment: HDFS-13792.000.patch

> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13749) Implement a new client protocol method to get NameNode state

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568813#comment-16568813
 ] 

Plamen Jeliazkov edited comment on HDFS-13749 at 8/3/18 9:34 PM:
-

I think eliminating superuser privileges from _getServiceStatus_ actually makes 
sense. [~shv], thoughts?

My reasoning:
(1) - _NameNodeRpcServer.setSafeMode_ only requires superuser privileges for 
setting SafeMode. Getting SafeMode status does not require superuser.
(2) - _DFSAdmin.report_ does not require superuser privileges despite being an 
"*Admin" class.
(3) - I don't see any harm in exposing the NameNode's service status to a 
non-superuser.
(4) - _setServiceStatus_ seems like the exact call we would want to use for 
this. It would essentially be unnecessary code duplication to use anything 
else.


was (Author: zero45):
I think eliminating superuser privileges from _getServiceStatus_ actually makes 
sense. [~shv], thoughts?

My reasoning:
(1) - _NameNodeRpcServer.setSafeMode_ only requires superuser privileges for 
setting SafeMode. Getting SafeMode status does not require superuser.
(2) - _DFSAdmin.report_ does not require superuser privileges despite being an 
"*Admin" class.
(3) - I don't see any harm in exposing the NameNode's service status to a 
non-superuser.

> Implement a new client protocol method to get NameNode state
> 
>
> Key: HDFS-13749
> URL: https://issues.apache.org/jira/browse/HDFS-13749
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13749-HDFS-12943.000.patch
>
>
> Currently {{HAServiceProtocol#getServiceStatus}} requires super user 
> privilege. Therefore, as a temporary solution, in HDFS-12976 we discover 
> NameNode state by calling {{reportBadBlocks}}. Here, we'll properly implement 
> this by adding a new method in client protocol to get the NameNode state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-230) ContainerStateMachine should provide readStateMachineData api to read data if Containers with required during replication

2018-08-03 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568814#comment-16568814
 ] 

Tsz Wo Nicholas Sze commented on HDDS-230:
--

>  I feel that we should reconstruct LogEntryProto as this is already done as 
> part of startTransaction in ContainerStateMachine. ...

Sure.  We may do the refactoring later.

> ContainerStateMachine should provide readStateMachineData api to read data if 
> Containers with required during replication
> -
>
> Key: HDDS-230
> URL: https://issues.apache.org/jira/browse/HDDS-230
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Critical
> Fix For: 0.2.1
>
> Attachments: HDDS-230.001.patch, HDDS-230.002.patch, 
> HDDS-230.003.patch, HDDS-230.004.patch
>
>
> Ozone datanode exits during data write with the following exception.
> {code}
> 2018-07-05 14:10:01,605 INFO org.apache.ratis.server.storage.RaftLogWorker: 
> Rolling segment:40356aa1-741f-499c-aad1-b500f2620a3d_9858-RaftLogWorker index 
> to:4565
> 2018-07-05 14:10:01,607 ERROR 
> org.apache.ratis.server.impl.StateMachineUpdater: Terminating with exit 
> status 2: StateMachineUpdater-40356aa1-741f-499c-aad1-b500f2620a3d_9858: the 
> StateMachineUpdater hits Throwable
> java.lang.NullPointerException
> at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:272)
> at 
> org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1058)
> at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:154)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> This might be the result of a Ratis transaction that was not written 
> through the "writeStateMachineData" phase but was nevertheless added to the 
> Raft log. This implies that the StateMachineUpdater now applies a transaction 
> without the corresponding entry being added to the state machine.
> I am raising this jira to track the issue and will also raise a Ratis jira if 
> required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13749) Implement a new client protocol method to get NameNode state

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568813#comment-16568813
 ] 

Plamen Jeliazkov commented on HDFS-13749:
-

I think eliminating superuser privileges from _getServiceStatus_ actually makes 
sense. [~shv], thoughts?

My reasoning:
(1) - _NameNodeRpcServer.setSafeMode_ only requires superuser privileges for 
setting SafeMode. Getting SafeMode status does not require superuser.
(2) - _DFSAdmin.report_ does not require superuser privileges despite being an 
"*Admin" class.
(3) - I don't see any harm in exposing the NameNode's service status to a 
non-superuser.

> Implement a new client protocol method to get NameNode state
> 
>
> Key: HDFS-13749
> URL: https://issues.apache.org/jira/browse/HDFS-13749
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13749-HDFS-12943.000.patch
>
>
> Currently {{HAServiceProtocol#getServiceStatus}} requires super user 
> privilege. Therefore, as a temporary solution, in HDFS-12976 we discover 
> NameNode state by calling {{reportBadBlocks}}. Here, we'll properly implement 
> this by adding a new method in client protocol to get the NameNode state.






[jira] [Comment Edited] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568674#comment-16568674
 ] 

Plamen Jeliazkov edited comment on HDFS-13767 at 8/3/18 9:22 PM:
-

Hey Chen,

Change looks pretty good so far; a couple of things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface, but 
I think it's clear there is a difference. I would not mind picking up that 
JIRA if we want to go ahead with it. This would allow the server-side version 
of AlignmentContext to better handle deferring / re-queueing calls.

(2) A concern about the unit test -- I tried to add a 10 second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However, I found that the test would fail if I did, which makes me question 
the unit test.
I would expect 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call, but it seems it is updating anyway.
I also tried stopping the _EditLogTailer_ on the ObserverNode, but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.

(3) So far these patches don't actually implement _NameNodeRpcServer.msync()_. 
Should we re-title the JIRA?

(4) [~xkrogen], [~shv], and others -- I was curious about making msync pause 
at the server level. What if we had msync hang at the client side instead? 
That way we aren't impacting Observer queues with many deferred calls. We 
could simply have the client check the Observers' states, and rather than 
re-queueing we throw a StandbyException (or something) and force the client 
to pause and retry the msync call on the same or a different Observer. If the 
msync call succeeds, then we know we can use the particular Observer that it 
succeeded on. Thoughts?
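The client-side retry idea in point (4) can be sketched as follows. This is a hypothetical model, not the actual Hadoop RPC code: the Observer interface, msyncWithRetry, and the use of IllegalStateException in place of StandbyException are all illustrative.

```java
import java.util.List;

// Hypothetical sketch of client-driven msync: probe each Observer, and on a
// StandbyException-style rejection move on to the next one instead of
// queueing the call on the server.
public class ClientMsyncSketch {
  public interface Observer {
    // Returns true if this Observer has caught up to the client's last-seen
    // transaction id; throws IllegalStateException (standing in for
    // StandbyException) if it is too far behind.
    boolean msync(long clientTxId);
  }

  // Try each Observer in turn; return the index of the first one that
  // accepts the msync, or -1 if none did (caller may pause and retry).
  public static int msyncWithRetry(List<Observer> observers, long txId) {
    for (int i = 0; i < observers.size(); i++) {
      try {
        if (observers.get(i).msync(txId)) {
          return i; // safe to route reads to this Observer
        }
      } catch (IllegalStateException e) {
        // Observer is lagging; fall through and try the next one.
      }
    }
    return -1;
  }
}
```

The design trade-off is that the wait moves into the client's retry loop, so lagging Observers never accumulate deferred calls in their RPC queues.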




was (Author: zero45):
Hey Chen,

Change looks pretty good so far; couple things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface but I 
think its clear there is a difference. I would not mind picking up that JIRA if 
we want to go ahead with it. This would let us allow the server-side version of 
AlignmentContext to better handle defering / re-queue'ing calls.

(2) A concern about the unit test -- I tried to add a 10 second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However I found that the test would fail if I did. Which makes me question the 
unit test.
I would expect the 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call but seems it is updating anyway.
I tried also stopping the _EditLogTailer_ on the ObserverNode but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.

(3) So far these patches don't actually implement _NameNodeRpcServer.msync()_. 
Should we rename it?

(4) [~xkrogen], [~shv], and others -- I was curious about making msync pause at 
the server level. What if we had msync hang at the client side instead? This 
way we aren't impacting Observer queues with many deferred calls. We could 
simply have the client check up with the Observers' states and so rather than 
re-queue'ing we throw a StandbyException (or something) and force the client to 
maybe pause and retry the msync call on the same or a different Observer. If 
the msync call succeeds then we know we can use the particular Observer that it 
succeeded on. Thoughts?



> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a follow-up on HDFS-13688, where the msync API was introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This Jira 
> is to implement the server-side logic.






[jira] [Commented] (HDFS-13749) Implement a new client protocol method to get NameNode state

2018-08-03 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568761#comment-16568761
 ] 

Chao Sun commented on HDFS-13749:
-

[~zero45]: the restriction came from HDFS-2917, which prevents regular users 
from running {{haadmin}}. Yes, good idea on reusing 
{{NameNode.getServiceState()}}. Will change.

> Implement a new client protocol method to get NameNode state
> 
>
> Key: HDFS-13749
> URL: https://issues.apache.org/jira/browse/HDFS-13749
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13749-HDFS-12943.000.patch
>
>
> Currently {{HAServiceProtocol#getServiceStatus}} requires super user 
> privilege. Therefore, as a temporary solution, in HDFS-12976 we discover 
> NameNode state by calling {{reportBadBlocks}}. Here, we'll properly implement 
> this by adding a new method in client protocol to get the NameNode state.






[jira] [Work started] (HDFS-13779) Implement performFailover logic for ObserverReadProxyProvider.

2018-08-03 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-13779 started by Erik Krogen.
--
> Implement performFailover logic for ObserverReadProxyProvider.
> --
>
> Key: HDFS-13779
> URL: https://issues.apache.org/jira/browse/HDFS-13779
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Konstantin Shvachko
>Assignee: Erik Krogen
>Priority: Major
>
> Currently {{ObserverReadProxyProvider}} inherits the {{performFailover()}} 
> method from {{ConfiguredFailoverProxyProvider}}, which simply increments the 
> index and switches over to another NameNode. The logic for ORPP should be 
> smart enough to choose another observer; otherwise it can switch to an SBN, 
> where reads are disallowed, or to an ANN, which defeats the purpose of reads 
> from standby.
> This was discussed in HDFS-12976.
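The observer-aware failover described above can be sketched in a few lines. This is a hypothetical illustration, not the actual ObserverReadProxyProvider code: the State enum and nextObserver method are invented names showing the selection logic only.

```java
// Hypothetical sketch: instead of blindly incrementing the proxy index
// (which may land on an active or standby NameNode), skip ahead to the next
// node currently reported as an Observer.
public class ObserverFailoverSketch {
  public enum State { ACTIVE, STANDBY, OBSERVER }

  // Returns the index of the next OBSERVER after 'current' (wrapping around),
  // or -1 if the cluster currently has no Observer to read from.
  public static int nextObserver(State[] nodes, int current) {
    for (int step = 1; step <= nodes.length; step++) {
      int i = (current + step) % nodes.length;
      if (nodes[i] == State.OBSERVER) {
        return i;
      }
    }
    return -1;
  }
}
```

In the real provider the states would come from querying each NameNode, and a -1 result would mean falling back to the active for the read.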






[jira] [Commented] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568741#comment-16568741
 ] 

Chao Sun commented on HDFS-13792:
-

[~linyiqun]: can you take a look at this patch? Thanks.

> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}






[jira] [Updated] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13792:

Attachment: HDFS-13792.000.patch

> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}






[jira] [Updated] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13792:

Status: Patch Available  (was: Open)

> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}






[jira] [Updated] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13792:

Description: 
The metrics name for FSN read/write lock should be in the format:

{code}
FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
{code}

not
{code}
FSN(Read|Write)Lock`*OperationName*`NumOps
{code}


  was:
The metrics name for FSN read/write lock should be in the format:

{code}
FSN(Read|Write)Lock`*OperationName*`NaNosNumOps{code}
{code}

not
{code}
FSN(Read|Write)Lock`*OperationName*`NumOps{code}
{code}



> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
> Environment: The metrics name for FSN read/write lock should be in 
> the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps{code}
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps{code}
> {code}
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}






[jira] [Updated] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13792:

Environment: (was: The metrics name for FSN read/write lock should be 
in the format:

{code}
FSN(Read|Write)Lock`*OperationName*`NaNosNumOps{code}
{code}

not
{code}
FSN(Read|Write)Lock`*OperationName*`NumOps{code}
{code}
)

> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}






[jira] [Created] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)
Chao Sun created HDFS-13792:
---

 Summary: Fix FSN read/write lock metrics name
 Key: HDFS-13792
 URL: https://issues.apache.org/jira/browse/HDFS-13792
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, metrics
 Environment: The metrics name for FSN read/write lock should be in the 
format:

{code}
FSN(Read|Write)Lock`*OperationName*`NaNosNumOps{code}
{code}

not
{code}
FSN(Read|Write)Lock`*OperationName*`NumOps{code}
{code}

Reporter: Chao Sun
Assignee: Chao Sun









[jira] [Updated] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-03 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13792:

Description: 
The metrics name for FSN read/write lock should be in the format:

{code}
FSN(Read|Write)Lock`*OperationName*`NaNosNumOps{code}
{code}

not
{code}
FSN(Read|Write)Lock`*OperationName*`NumOps{code}
{code}


> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
> Environment: The metrics name for FSN read/write lock should be in 
> the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps{code}
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps{code}
> {code}
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NaNosNumOps{code}
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps{code}
> {code}






[jira] [Commented] (HDFS-13728) Disk Balancer should not fail if volume usage is greater than capacity

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568731#comment-16568731
 ] 

genericqa commented on HDFS-13728:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 39s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13728 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12931374/HDFS-13728.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ef39cbccf3e2 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2b18bb4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24696/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24696/testReport/ |
| Max. process+thread count | 3468 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24

[jira] [Commented] (HDFS-13749) Implement a new client protocol method to get NameNode state

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568693#comment-16568693
 ] 

Plamen Jeliazkov commented on HDFS-13749:
-

Is there a particular reason that superuser privileges are required for 
getting the service status? 
I can't seem to find one other than that HAAdmin generally requires it.

We could simply re-use _NameNodeRpcServer.getServiceStatus()_ if we can 
eliminate the superuser privilege requirement.
Otherwise, if you had to add a new RPC call, another possibility is to reuse 
_NameNode.getServiceState()_, which is a fast call and already does not 
require superuser privileges.
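The distinction being proposed can be sketched as follows. This is a hypothetical model, not the actual NameNodeRpcServer code: the class, enum, and method names are illustrative, showing only the idea that a mutating HA operation keeps the privilege check while a read-only state query does not.

```java
// Hypothetical sketch: mutating HA transitions remain superuser-only, while
// querying the current state is open to any caller.
public class ServiceStateSketch {
  public enum HAState { ACTIVE, STANDBY, OBSERVER }

  private HAState state;

  public ServiceStateSketch(HAState state) {
    this.state = state;
  }

  // Mutating operation: keeps the superuser check (as in HDFS-2917).
  public void transitionToActive(boolean isSuperuser) {
    if (!isSuperuser) {
      throw new SecurityException("superuser privilege required");
    }
    state = HAState.ACTIVE;
  }

  // Read-only status query: no privilege check, as proposed in the comment.
  public HAState getServiceState() {
    return state;
  }
}
```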


> Implement a new client protocol method to get NameNode state
> 
>
> Key: HDFS-13749
> URL: https://issues.apache.org/jira/browse/HDFS-13749
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13749-HDFS-12943.000.patch
>
>
> Currently {{HAServiceProtocol#getServiceStatus}} requires super user 
> privilege. Therefore, as a temporary solution, in HDFS-12976 we discover 
> NameNode state by calling {{reportBadBlocks}}. Here, we'll properly implement 
> this by adding a new method in client protocol to get the NameNode state.






[jira] [Comment Edited] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568674#comment-16568674
 ] 

Plamen Jeliazkov edited comment on HDFS-13767 at 8/3/18 7:57 PM:
-

Hey Chen,

Change looks pretty good so far; a couple of things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface, but 
I think it's clear there is a difference. I would not mind picking up that 
JIRA if we want to go ahead with it. This would allow the server-side version 
of AlignmentContext to better handle deferring / re-queueing calls.

(2) A concern about the unit test -- I tried to add a 10 second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However, I found that the test would fail if I did, which makes me question 
the unit test.
I would expect 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call, but it seems it is updating anyway.
I also tried stopping the _EditLogTailer_ on the ObserverNode, but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.

(3) So far these patches don't actually implement _NameNodeRpcServer.msync()_. 
Should we rename it?

(4) [~xkrogen], [~shv], and others -- I was curious about making msync pause 
at the server level. What if we had msync hang at the client side instead? 
That way we aren't impacting Observer queues with many deferred calls. We 
could simply have the client check the Observers' states, and rather than 
re-queueing we throw a StandbyException (or something) and force the client 
to pause and retry the msync call on the same or a different Observer. If the 
msync call succeeds, then we know we can use the particular Observer that it 
succeeded on. Thoughts?




was (Author: zero45):
Hey Chen,

Change looks pretty good so far; couple things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface but I 
think its clear there is a difference. I would not mind picking up that JIRA if 
we want to go ahead with it. This would let us allow the server-side version of 
AlignmentContext to better handle defering / re-queue'ing calls.

(2) A concern about the unit test -- I tried to add a 10 second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However I found that the test would fail if I did. Which makes me question the 
unit test.
I would expect the 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call but seems it is updating anyway.
I tried also stopping the _EditLogTailer_ on the ObserverNode but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.

(3) So far these patches don't actually implement _NameNodeRpcServer.msync()_. 
Should we rename it?

(4) [~xkrogen], [~shv], and others -- I was curios about making msync pause at 
the server level. What if we had msync hang at the client side instead? This 
way we aren't impacting Observer queues with many deferred calls. We could 
simply have the client check up with the Observers' states and so rather than 
re-queue'ing we throw a StandbyException (or something) and force the client to 
maybe pause and retry the msync call on the same or a different Observer. If 
the msync call succeeds then we know we can use the particular Observer that it 
succeeded on. Thoughts?



> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a follow-up on HDFS-13688, where the msync API was introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This Jira 
> is to implement the server-side logic.






[jira] [Comment Edited] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568674#comment-16568674
 ] 

Plamen Jeliazkov edited comment on HDFS-13767 at 8/3/18 7:56 PM:
-

Hey Chen,

Change looks pretty good so far; a couple of things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface, but 
I think it's clear there is a difference. I would not mind picking up that 
JIRA if we want to go ahead with it. This would allow the server-side version 
of AlignmentContext to better handle deferring / re-queueing calls.

(2) A concern about the unit test -- I tried to add a 10 second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However, I found that the test would fail if I did, which makes me question 
the unit test.
I would expect 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call, but it seems it is updating anyway.
I also tried stopping the _EditLogTailer_ on the ObserverNode, but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.

(3) So far these patches don't actually implement _NameNodeRpcServer.msync()_. 
Should we rename it?

(4) [~xkrogen], [~shv], and others -- I was curious about making msync pause 
at the server level. What if we had msync hang at the client side instead? 
That way we aren't impacting Observer queues with many deferred calls. We 
could simply have the client check the Observers' states, and rather than 
re-queueing we throw a StandbyException (or something) and force the client 
to pause and retry the msync call on the same or a different Observer. If the 
msync call succeeds, then we know we can use the particular Observer that it 
succeeded on. Thoughts?




was (Author: zero45):
Hey Chen,

Change looks pretty good so far; two things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface but I 
think its clear there is a difference. I would not mind picking up that JIRA if 
we want to go ahead with it. This would let us allow the server-side version of 
AlignmentContext to better handle defering / re-queue'ing calls.

(2) A concern about the unit test -- I tried to add a 10 second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However I found that the test would fail if I did. Which makes me question the 
unit test.
I would expect the 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call but seems it is updating anyway.
I tried also stopping the _EditLogTailer_ on the ObserverNode but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.



> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a follow-up on HDFS-13688, where the msync API was introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This Jira 
> is to implement the server-side logic.






[jira] [Comment Edited] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568674#comment-16568674
 ] 

Plamen Jeliazkov edited comment on HDFS-13767 at 8/3/18 7:44 PM:
-

Hey Chen,

Change looks pretty good so far; two things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface, but I 
think it's clear there is a difference. I would not mind picking up that JIRA if 
we want to go ahead with it. This would allow the server-side version of 
AlignmentContext to better handle deferring / re-queueing calls.

(2) A concern about the unit test -- I tried to add a 10-second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However, I found that the test would fail if I did, which makes me question the 
unit test.
I would expect 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call, but it seems to update anyway.
I also tried stopping the _EditLogTailer_ on the ObserverNode, but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.
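The interface split in (1) could look roughly like the following sketch; the names and methods here are hypothetical, not the actual HDFS-12943 types:

```java
/** Illustrative sketch of splitting AlignmentContext into a client-side
 *  and a server-side interface; all names and methods are hypothetical. */
interface ClientAlignmentContext {
    long getLastSeenStateId();                 // state the client has observed
    void receiveResponseState(long stateId);   // update from a server response
}

interface ServerAlignmentContext {
    long getCurrentStateId();                  // state the server can serve
    /** True if a call carrying clientStateId must be deferred / re-queued
     *  until this server catches up. */
    default boolean shouldDefer(long clientStateId) {
        return clientStateId > getCurrentStateId();
    }
}
```

The server-side interface is where deferring / re-queueing logic would naturally live, which is the difference the split makes explicit.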




was (Author: zero45):
Hey Chen,

Change looks pretty good so far; two things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface, but I 
think it's clear there is a difference. I would not mind picking up that JIRA if 
we want to go ahead with it. This would allow the server-side version of 
AlignmentContext to better handle deferring / re-queueing calls.

(2) A concern about the unit test -- I tried to add a 10-second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However, I found that the test would fail if I did, which makes me question the 
unit test.
I would expect 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call, but it seems to update anyway.
I also tried stopping the _EditLogTailer_ on the ObserverNode, but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.



> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a followup on HDFS-13688, where the msync API is introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This Jira 
> is to implement the server-side logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13767) Add msync server implementation.

2018-08-03 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568674#comment-16568674
 ] 

Plamen Jeliazkov commented on HDFS-13767:
-

Hey Chen,

Change looks pretty good so far; two things.

(1) I like the idea of splitting AlignmentContext into client and server 
interfaces. I think originally Konstantin wanted just a single interface, but I 
think it's clear there is a difference. I would not mind picking up that JIRA if 
we want to go ahead with it. This would allow the server-side version of 
AlignmentContext to better handle deferring / re-queueing calls.

(2) A concern about the unit test -- I tried to add a 10-second sleep between 
the thread start and the assert:
{code:java}
assertFalse(readSucceed.get());
{code}
However, I found that the test would fail if I did, which makes me question the 
unit test.
I would expect 'getFileStatus' to basically hang until the 
_rollEditLogAndTail_ call, but it seems to update anyway.
I also tried stopping the _EditLogTailer_ on the ObserverNode, but it still 
updated anyway.
Please let me know if I am missing something about this test; I will look 
further.
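The ordering the test intends to check can be simulated in isolation. In this sketch a CountDownLatch stands in for _rollEditLogAndTail_; every name below is illustrative rather than taken from the actual test. If the read can complete before the latch is released, the assertFalse above would fail, which matches the behavior described:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

/** Simulates the expected ordering: the read should not succeed until
 *  edits have been tailed. Illustrative stand-ins only. */
public class ReadOrderingSketch {
    static boolean orderingHolds() {
        try {
            AtomicBoolean readSucceed = new AtomicBoolean(false);
            CountDownLatch editsTailed = new CountDownLatch(1);
            Thread reader = new Thread(() -> {
                try {
                    editsTailed.await();   // stand-in for the read blocking on state
                    readSucceed.set(true); // then getFileStatus would return
                } catch (InterruptedException ignored) { }
            });
            reader.start();
            Thread.sleep(100);                     // the sleep added before the assert
            boolean earlyRead = readSucceed.get(); // should still be false here
            editsTailed.countDown();               // stand-in for rollEditLogAndTail
            reader.join();
            return !earlyRead && readSucceed.get();
        } catch (InterruptedException e) {
            return false;
        }
    }
}
```

In the real test, a read completing during the sleep would mean the Observer is applying edits without the explicit roll, which is exactly the suspicion raised above.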



> Add msync server implementation.
> 
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13767-HDFS-12943.001.patch, 
> HDFS-13767-HDFS-12943.002.patch, HDFS-13767.WIP.001.patch, 
> HDFS-13767.WIP.002.patch, HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a followup on HDFS-13688, where the msync API is introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This Jira 
> is to implement the server-side logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-265) Move numPendingDeletionBlocks and deleteTransactionId from ContainerData to KeyValueContainerData

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568659#comment-16568659
 ] 

genericqa commented on HDDS-265:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 31m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 31m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 31m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 19s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} container-service in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  6m 31s{color} 
| {color:red} integration-test in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 55s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.freon.TestDataValidate |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-265 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934218/HDDS-265.000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7235d3e4ae49 3.13.0-153-generic #203-Ubuntu SMP Thu

[jira] [Commented] (HDDS-268) Add SCM close container watcher

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568629#comment-16568629
 ] 

genericqa commented on HDDS-268:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
33s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 37s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
26s{color} | {color:green} framework in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
22s{color} | {color:green} server-scm in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-268 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934298/HDDS-268.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0988154da5b4 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2b18bb4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDD

[jira] [Commented] (HDFS-13738) fsck -list-corruptfileblocks has infinite loop if user is not privileged.

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568619#comment-16568619
 ] 

genericqa commented on HDFS-13738:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  8s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 43s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13738 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934287/HDFS-13738.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f2843191adb4 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2b18bb4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24695/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24695/testReport/ |
| Max. process+thread count | 3522 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24695/console |
| Powered by | Apache Yetu

[jira] [Assigned] (HDFS-13780) Postpone NameNode state discovery in ObserverReadProxyProvider until the first real RPC call.

2018-08-03 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang reassigned HDFS-13780:
-

Assignee: Chen Liang

> Postpone NameNode state discovery in ObserverReadProxyProvider until the 
> first real RPC call.
> -
>
> Key: HDFS-13780
> URL: https://issues.apache.org/jira/browse/HDFS-13780
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
>
> Currently {{ObserverReadProxyProvider}} during instantiation discovers 
> Observers by poking known NameNodes and checking their states. This rather 
> expensive process can be postponed until the first actual RPC call.
> This is an optimization.
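A minimal sketch of that postponement, assuming nothing about the real {{ObserverReadProxyProvider}} internals: the expensive probe runs lazily on the first use instead of in the constructor.

```java
import java.util.List;
import java.util.function.Supplier;

/** Sketch of deferring Observer discovery to the first call; names are
 *  illustrative, not the real ObserverReadProxyProvider. */
public class LazyDiscoverySketch {
    private final Supplier<List<String>> probe;   // expensive state discovery
    private volatile List<String> observers;      // null until first use

    LazyDiscoverySketch(Supplier<List<String>> probe) {
        this.probe = probe;                       // nothing probed here
    }

    List<String> observers() {
        List<String> local = observers;
        if (local == null) {
            synchronized (this) {                 // double-checked lazy init
                local = observers;
                if (local == null) {
                    observers = local = probe.get();
                }
            }
        }
        return local;
    }
}
```

Clients that never issue an RPC then never pay the discovery cost, which is the optimization described.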



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13789) Reduce logging frequency of QuorumJournalManager#selectInputStreams

2018-08-03 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568598#comment-16568598
 ] 

Chao Sun commented on HDFS-13789:
-

Sounds good [~xkrogen]. +1 on this JIRA.

> Reduce logging frequency of QuorumJournalManager#selectInputStreams
> ---
>
> Key: HDFS-13789
> URL: https://issues.apache.org/jira/browse/HDFS-13789
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, qjm
>Affects Versions: HDFS-12943
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Trivial
> Attachments: HDFS-13789-HDFS-12943.000.patch
>
>
> As part of HDFS-13150, a logging statement was added to indicate whenever an 
> edit tail is performed via the RPC mechanism. To enable low latency tailing, 
> the tail frequency must be set very low, so this log statement gets printed 
> much too frequently at an INFO level. We should decrease to DEBUG. Note that 
> if there are actually edits available to tail, other log messages will get 
> printed; this is just targeting the case when it attempts to tail and there 
> are no new edits.
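The proposed fix is only a severity drop. Sketched with {{java.util.logging}} as a stand-in (Hadoop itself uses its own logging facade, and the logger name here is illustrative), the message simply moves below the default level:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch: the per-tail-attempt message drops from INFO to a debug-level
 *  severity (FINE in JUL terms), so it is suppressed by default. */
public class TailLogLevelSketch {
    private static final Logger LOG = Logger.getLogger("QuorumJournalManager");

    static void logTailAttempt(long txid) {
        // was effectively LOG.info(...), printed on every tail attempt
        LOG.log(Level.FINE, "No new edits to tail beyond txid {0}", txid);
    }
}
```

At the default (INFO) level the statement emits nothing, so frequent tail attempts with no new edits stay out of the logs.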



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13789) Reduce logging frequency of QuorumJournalManager#selectInputStreams

2018-08-03 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568589#comment-16568589
 ] 

Erik Krogen commented on HDFS-13789:


Hey [~csun], thanks for taking a look. Yes, I agree we should do so, but that 
will be a bit more complex. I created HDFS-13791 to track it; in this JIRA, can 
we just focus on this log statement, which is the most egregious offender?

> Reduce logging frequency of QuorumJournalManager#selectInputStreams
> ---
>
> Key: HDFS-13789
> URL: https://issues.apache.org/jira/browse/HDFS-13789
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, qjm
>Affects Versions: HDFS-12943
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Trivial
> Attachments: HDFS-13789-HDFS-12943.000.patch
>
>
> As part of HDFS-13150, a logging statement was added to indicate whenever an 
> edit tail is performed via the RPC mechanism. To enable low latency tailing, 
> the tail frequency must be set very low, so this log statement gets printed 
> much too frequently at an INFO level. We should decrease to DEBUG. Note that 
> if there are actually edits available to tail, other log messages will get 
> printed; this is just targeting the case when it attempts to tail and there 
> are no new edits.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13791) Limit logging frequency of edit tail related statements

2018-08-03 Thread Erik Krogen (JIRA)
Erik Krogen created HDFS-13791:
--

 Summary: Limit logging frequency of edit tail related statements
 Key: HDFS-13791
 URL: https://issues.apache.org/jira/browse/HDFS-13791
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs, qjm
Reporter: Erik Krogen
Assignee: Erik Krogen


There are a number of log statements that occur every time new edits are tailed 
by a Standby NameNode. When edits are tailed only every few tens of seconds, 
this is fine. With the work in HDFS-13150, however, edits may be tailed every 
few milliseconds, which can flood the logs with tailing-related statements. We 
should throttle this logging to print at most, say, once per 5 seconds.

We can implement logic similar to that used in HDFS-10713. This may be slightly 
trickier, since the log statements are distributed across a few classes.
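A minimal sketch of such a throttle, assuming nothing about the helper that was eventually committed: one record is allowed per interval, and the number of suppressed records is reported when logging resumes.

```java
/** Sketch of a per-statement rate limiter: allow one log record per
 *  interval and count what was suppressed in between. Illustrative only. */
public class LogThrottleSketch {
    private final long intervalMs;
    private long lastLoggedMs;
    private boolean everLogged = false;
    private long suppressed = 0;

    LogThrottleSketch(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    /** Returns the number of records suppressed since the last emission
     *  (>= 0) if the caller should log now, or -1 to drop this record. */
    synchronized long record(long nowMs) {
        if (!everLogged || nowMs - lastLoggedMs >= intervalMs) {
            long skipped = suppressed;
            suppressed = 0;
            lastLoggedMs = nowMs;
            everLogged = true;
            return skipped;
        }
        suppressed++;
        return -1;
    }
}
```

Each throttled statement would own one such instance keyed by the statement, so a "suppressed N similar messages" note can be attached when the next record is emitted.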



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-323) Rename Storage Containers

2018-08-03 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned HDDS-323:
--

Assignee: Arpit Agarwal

> Rename Storage Containers
> -
>
> Key: HDDS-323
> URL: https://issues.apache.org/jira/browse/HDDS-323
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
>
> The term container is heavily overloaded and easy to confuse with YARN/Linux 
> containers.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-323) Rename Storage Containers

2018-08-03 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned HDDS-323:
--

Assignee: (was: Arpit Agarwal)

> Rename Storage Containers
> -
>
> Key: HDDS-323
> URL: https://issues.apache.org/jira/browse/HDDS-323
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Priority: Major
>
> The term container is heavily overloaded and easy to confuse with YARN/Linux 
> containers.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-323) Rename Storage Containers

2018-08-03 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-323:
--

 Summary: Rename Storage Containers
 Key: HDDS-323
 URL: https://issues.apache.org/jira/browse/HDDS-323
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Arpit Agarwal


The term container is heavily overloaded and easy to confuse with YARN/Linux 
containers.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13782) ObserverReadProxyProvider should work with IPFailoverProxyProvider

2018-08-03 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13782:
---
Summary: ObserverReadProxyProvider should work with IPFailoverProxyProvider 
 (was: IPFailoverProxyProvider should work with IPFailoverProxyProvider)

> ObserverReadProxyProvider should work with IPFailoverProxyProvider
> --
>
> Key: HDFS-13782
> URL: https://issues.apache.org/jira/browse/HDFS-13782
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Priority: Major
>
> Currently {{ObserverReadProxyProvider}} is based on 
> {{ConfiguredFailoverProxyProvider}}. We should also be able to perform SBN 
> reads in the case of {{IPFailoverProxyProvider}}.






[jira] [Updated] (HDFS-13782) IPFailoverProxyProvider should work with IPFailoverProxyProvider

2018-08-03 Thread Konstantin Shvachko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-13782:
---
Summary: IPFailoverProxyProvider should work with IPFailoverProxyProvider  
(was: IPFailoverProxyProvider should work with SBN)

> IPFailoverProxyProvider should work with IPFailoverProxyProvider
> 
>
> Key: HDFS-13782
> URL: https://issues.apache.org/jira/browse/HDFS-13782
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Priority: Major
>
> Currently {{ObserverReadProxyProvider}} is based on 
> {{ConfiguredFailoverProxyProvider}}. We should also be able to perform SBN 
> reads in the case of {{IPFailoverProxyProvider}}.






[jira] [Assigned] (HDFS-13779) Implement performFailover logic for ObserverReadProxyProvider.

2018-08-03 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen reassigned HDFS-13779:
--

Assignee: Erik Krogen

> Implement performFailover logic for ObserverReadProxyProvider.
> --
>
> Key: HDFS-13779
> URL: https://issues.apache.org/jira/browse/HDFS-13779
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Konstantin Shvachko
>Assignee: Erik Krogen
>Priority: Major
>
> Currently {{ObserverReadProxyProvider}} inherits the {{performFailover()}} 
> method from {{ConfiguredFailoverProxyProvider}}, which simply increments the 
> index and switches over to another NameNode. The logic for ORPP should be 
> smart enough to choose another observer; otherwise it can switch to an SBN, 
> where reads are disallowed, or to an ANN, which defeats the purpose of reads 
> from standby.
> This was discussed in HDFS-12976.
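As a hedged illustration of the behavior described above, here is a minimal 
stand-alone sketch (the class, enum, and method names are invented for this 
example; this is not the actual ORPP code): instead of blindly incrementing 
the proxy index, failover scans forward for the next node in the OBSERVER 
state.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of observer-aware failover: a plain index
// increment may land on a standby (reads disallowed) or the active
// (defeats standby reads), so scan forward for the next OBSERVER.
public class ObserverFailoverSketch {
    public enum HAState { ACTIVE, STANDBY, OBSERVER }

    private final List<HAState> nodes;
    private int current;

    public ObserverFailoverSketch(List<HAState> nodes) {
        this.nodes = nodes;
    }

    // Advances to the next observer after the current proxy, wrapping
    // around; returns the new index, or the unchanged index if no
    // observer exists (the caller must handle that case).
    public int performFailover() {
        for (int i = 1; i <= nodes.size(); i++) {
            int candidate = (current + i) % nodes.size();
            if (nodes.get(candidate) == HAState.OBSERVER) {
                current = candidate;
                return current;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        ObserverFailoverSketch p = new ObserverFailoverSketch(
            Arrays.asList(HAState.ACTIVE, HAState.STANDBY,
                          HAState.OBSERVER, HAState.OBSERVER));
        System.out.println(p.performFailover()); // skips ANN and SBN -> 2
        System.out.println(p.performFailover()); // next observer -> 3
    }
}
```

The real provider would also need retry/fallback behavior when no observer is 
reachable; this sketch only shows the selection rule.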






[jira] [Commented] (HDFS-13728) Disk Balancer should not fail if volume usage is greater than capacity

2018-08-03 Thread Stephen O'Donnell (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568559#comment-16568559
 ] 

Stephen O'Donnell commented on HDFS-13728:
--

[~gabor.bota] I have assigned this one to myself and submitted the patch. 
Please review it if you have time, and we can see what comes back from the 
patch submission.

> Disk Balancer should not fail if volume usage is greater than capacity
> --
>
> Key: HDFS-13728
> URL: https://issues.apache.org/jira/browse/HDFS-13728
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: diskbalancer
>Affects Versions: 3.0.3
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Minor
> Attachments: HDFS-13728.001.patch
>
>
> We have seen a couple of scenarios where the disk balancer fails because a 
> datanode reports more space used on a disk than its capacity, which should 
> not be possible.
> This is due to the check below in DiskBalancerVolume.java:
> {code}
>   public void setUsed(long dfsUsedSpace) {
> Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
> "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
> dfsUsedSpace, getCapacity());
> this.used = dfsUsedSpace;
>   }
> {code}
> While I agree that it should not be possible for a DN to report more usage on 
> a volume than its capacity, there seems to be some issue that causes this to 
> occur sometimes.
> In general, a full disk is exactly what prompts someone to run the Disk 
> Balancer, only to find that it fails with this error.
> There appears to be nothing you can do to force the Disk Balancer to run at 
> this point, but in the scenarios I saw, some data was removed from the disk 
> and usage dropped below the capacity resolving the issue.
> Can we consider relaxing the above check: if the usage is greater than the 
> capacity, just set the usage to the capacity so that the calculations all 
> work?
> E.g. something like this:
> {code}
>public void setUsed(long dfsUsedSpace) {
> -Preconditions.checkArgument(dfsUsedSpace < this.getCapacity());
> -this.used = dfsUsedSpace;
> +if (dfsUsedSpace > this.getCapacity()) {
> +  this.used = this.getCapacity();
> +} else {
> +  this.used = dfsUsedSpace;
> +}
>}
> {code}
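The proposed relaxation can be tried out in isolation. The following is a 
hypothetical stand-alone class (not the actual DiskBalancerVolume, which 
carries much more state); it only illustrates clamping over-reported usage to 
the volume capacity instead of failing a Preconditions check:

```java
// Hypothetical minimal sketch of the relaxed setUsed() behavior:
// over-reported usage is clamped to capacity rather than rejected.
public class VolumeUsageSketch {
    private final long capacity;
    private long used;

    public VolumeUsageSketch(long capacity) {
        this.capacity = capacity;
    }

    public long getUsed() {
        return used;
    }

    // Clamp rather than throw when a datanode over-reports usage.
    public void setUsed(long dfsUsedSpace) {
        this.used = Math.min(dfsUsedSpace, capacity);
    }

    public static void main(String[] args) {
        VolumeUsageSketch v = new VolumeUsageSketch(100L);
        v.setUsed(120L); // over-reported: clamped to capacity
        System.out.println(v.getUsed()); // 100
        v.setUsed(80L);  // normal case: stored as-is
        System.out.println(v.getUsed()); // 80
    }
}
```

One design note: clamping silently hides the datanode's bad report, so the 
real implementation might also want to log a warning when the clamp fires.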






[jira] [Commented] (HDFS-13779) Implement performFailover logic for ObserverReadProxyProvider.

2018-08-03 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568558#comment-16568558
 ] 

Konstantin Shvachko commented on HDFS-13779:


Also reuse one of the existing retry policies for ORPP.

> Implement performFailover logic for ObserverReadProxyProvider.
> --
>
> Key: HDFS-13779
> URL: https://issues.apache.org/jira/browse/HDFS-13779
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Konstantin Shvachko
>Priority: Major
>
> Currently {{ObserverReadProxyProvider}} inherits the {{performFailover()}} 
> method from {{ConfiguredFailoverProxyProvider}}, which simply increments the 
> index and switches over to another NameNode. The logic for ORPP should be 
> smart enough to choose another observer; otherwise it can switch to an SBN, 
> where reads are disallowed, or to an ANN, which defeats the purpose of reads 
> from standby.
> This was discussed in HDFS-12976.






[jira] [Updated] (HDFS-13728) Disk Balancer should not fail if volume usage is greater than capacity

2018-08-03 Thread Stephen O'Donnell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-13728:
-
Status: Patch Available  (was: Open)

> Disk Balancer should not fail if volume usage is greater than capacity
> --
>
> Key: HDFS-13728
> URL: https://issues.apache.org/jira/browse/HDFS-13728
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: diskbalancer
>Affects Versions: 3.0.3
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Minor
> Attachments: HDFS-13728.001.patch
>
>
> We have seen a couple of scenarios where the disk balancer fails because a 
> datanode reports more space used on a disk than its capacity, which should 
> not be possible.
> This is due to the check below in DiskBalancerVolume.java:
> {code}
>   public void setUsed(long dfsUsedSpace) {
> Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
> "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
> dfsUsedSpace, getCapacity());
> this.used = dfsUsedSpace;
>   }
> {code}
> While I agree that it should not be possible for a DN to report more usage on 
> a volume than its capacity, there seems to be some issue that causes this to 
> occur sometimes.
> In general, a full disk is exactly what prompts someone to run the Disk 
> Balancer, only to find that it fails with this error.
> There appears to be nothing you can do to force the Disk Balancer to run at 
> this point, but in the scenarios I saw, some data was removed from the disk 
> and usage dropped below the capacity resolving the issue.
> Can we consider relaxing the above check: if the usage is greater than the 
> capacity, just set the usage to the capacity so that the calculations all 
> work?
> E.g. something like this:
> {code}
>public void setUsed(long dfsUsedSpace) {
> -Preconditions.checkArgument(dfsUsedSpace < this.getCapacity());
> -this.used = dfsUsedSpace;
> +if (dfsUsedSpace > this.getCapacity()) {
> +  this.used = this.getCapacity();
> +} else {
> +  this.used = dfsUsedSpace;
> +}
>}
> {code}






[jira] [Assigned] (HDFS-13728) Disk Balancer should not fail if volume usage is greater than capacity

2018-08-03 Thread Stephen O'Donnell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell reassigned HDFS-13728:


Assignee: Stephen O'Donnell  (was: Gabor Bota)

> Disk Balancer should not fail if volume usage is greater than capacity
> --
>
> Key: HDFS-13728
> URL: https://issues.apache.org/jira/browse/HDFS-13728
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: diskbalancer
>Affects Versions: 3.0.3
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Minor
> Attachments: HDFS-13728.001.patch
>
>
> We have seen a couple of scenarios where the disk balancer fails because a 
> datanode reports more space used on a disk than its capacity, which should 
> not be possible.
> This is due to the check below in DiskBalancerVolume.java:
> {code}
>   public void setUsed(long dfsUsedSpace) {
> Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
> "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
> dfsUsedSpace, getCapacity());
> this.used = dfsUsedSpace;
>   }
> {code}
> While I agree that it should not be possible for a DN to report more usage on 
> a volume than its capacity, there seems to be some issue that causes this to 
> occur sometimes.
> In general, a full disk is exactly what prompts someone to run the Disk 
> Balancer, only to find that it fails with this error.
> There appears to be nothing you can do to force the Disk Balancer to run at 
> this point, but in the scenarios I saw, some data was removed from the disk 
> and usage dropped below the capacity resolving the issue.
> Can we consider relaxing the above check: if the usage is greater than the 
> capacity, just set the usage to the capacity so that the calculations all 
> work?
> E.g. something like this:
> {code}
>public void setUsed(long dfsUsedSpace) {
> -Preconditions.checkArgument(dfsUsedSpace < this.getCapacity());
> -this.used = dfsUsedSpace;
> +if (dfsUsedSpace > this.getCapacity()) {
> +  this.used = this.getCapacity();
> +} else {
> +  this.used = dfsUsedSpace;
> +}
>}
> {code}






[jira] [Assigned] (HDFS-13778) In TestStateAlignmentContextWithHA replace artificial AlignmentContextProxyProvider with real ObserverReadProxyProvider.

2018-08-03 Thread Konstantin Shvachko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko reassigned HDFS-13778:
--

Assignee: Sherwood Zheng

> In TestStateAlignmentContextWithHA replace artificial 
> AlignmentContextProxyProvider with real ObserverReadProxyProvider.
> 
>
> Key: HDFS-13778
> URL: https://issues.apache.org/jira/browse/HDFS-13778
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Sherwood Zheng
>Priority: Major
>
> TestStateAlignmentContextWithHA uses an artificial 
> AlignmentContextProxyProvider, which was temporarily needed for testing. Now 
> that we have the real ObserverReadProxyProvider, it can take over from ACPP. 
> This is also useful for testing the ORPP.






[jira] [Commented] (HDDS-268) Add SCM close container watcher

2018-08-03 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568552#comment-16568552
 ] 

Ajay Kumar commented on HDDS-268:
-

[~xyao], patch v3 rebased against trunk. Changed the sleep time in the test 
cases (they pass locally but fail in the Jenkins run).

> Add SCM close container watcher
> ---
>
> Key: HDDS-268
> URL: https://issues.apache.org/jira/browse/HDDS-268
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-268.00.patch, HDDS-268.01.patch, HDDS-268.02.patch, 
> HDDS-268.03.patch
>
>







[jira] [Updated] (HDDS-268) Add SCM close container watcher

2018-08-03 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-268:

Attachment: HDDS-268.03.patch

> Add SCM close container watcher
> ---
>
> Key: HDDS-268
> URL: https://issues.apache.org/jira/browse/HDDS-268
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-268.00.patch, HDDS-268.01.patch, HDDS-268.02.patch, 
> HDDS-268.03.patch
>
>







[jira] [Updated] (HDDS-268) Add SCM close container watcher

2018-08-03 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-268:

Attachment: (was: HDDS-268.03.patch)

> Add SCM close container watcher
> ---
>
> Key: HDDS-268
> URL: https://issues.apache.org/jira/browse/HDDS-268
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-268.00.patch, HDDS-268.01.patch, HDDS-268.02.patch, 
> HDDS-268.03.patch
>
>







[jira] [Updated] (HDDS-268) Add SCM close container watcher

2018-08-03 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-268:

Attachment: HDDS-268.03.patch

> Add SCM close container watcher
> ---
>
> Key: HDDS-268
> URL: https://issues.apache.org/jira/browse/HDDS-268
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-268.00.patch, HDDS-268.01.patch, HDDS-268.02.patch, 
> HDDS-268.03.patch
>
>







[jira] [Comment Edited] (HDFS-13776) RBF: Add Storage policies related ClientProtocol APIs

2018-08-03 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568506#comment-16568506
 ] 

Íñigo Goiri edited comment on HDFS-13776 at 8/3/18 5:32 PM:


Thanks [~dibyendu_hadoop] for [^HDFS-13776-000.patch].
A couple of comments:
* Can we move the {{getLocationsForPath()}} chunks inside 
{{RouterStoragePolicy}}? (The erasure coding module does that.)
* Fix the javac warning by using assertArrayEquals or so.
* Can we add a javadoc to the new class?
* Minor style nits:
** Remove the extra space in {{private  final RouterStoragePolicy 
storagePolicy;}}.
** Add an extra line at RouterRpcServer#211 to keep the double-space separation.

The tests look good; have you tested this in a cluster or so?


was (Author: elgoiri):
Thanks [~dibyendu_hadoop] for  [^HDFS-13776-000.patch].
A couple comments:
* Can we move the {{getLocationsForPath()}} chunks inside the 
{{RouterStoragePolicy}} (the erasure coding module does that).
* Fix the javac warning by using assertArrayEquals or so.
* Minor styles nits:
** Remove the extra space in {{private  final RouterStoragePolicy 
storagePolicy;}}.
** Add an extra line in RouterRpcServer#211 to keep the double space separation.

The tests look good, have you tested this in a cluster or so?

> RBF: Add Storage policies related ClientProtocol APIs
> -
>
> Key: HDFS-13776
> URL: https://issues.apache.org/jira/browse/HDFS-13776
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13776-000.patch
>
>
> Currently unsetStoragePolicy and getStoragePolicy are not implemented in 
> RouterRpcServer.






[jira] [Comment Edited] (HDFS-11610) sun.net.spi.nameservice.NameService has moved to a new location

2018-08-03 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568509#comment-16568509
 ] 

Akira Ajisaka edited comment on HDFS-11610 at 8/3/18 5:32 PM:
--

Thanks [~tasanuma0829] for your first commit as a committer!


was (Author: ajisakaa):
Thanks [~tasanuma0829] for your first commit!

> sun.net.spi.nameservice.NameService has moved to a new location
> ---
>
> Key: HDFS-11610
> URL: https://issues.apache.org/jira/browse/HDFS-11610
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-11610.001.patch, HDFS-11610.002.patch
>
>
> sun.net.spi.nameservice.NameService was moved to 
> java.net.InetAddress$NameService in Java 9. TestDFSClientFailover uses this 
> class to spy nameservice.






[jira] [Commented] (HDFS-11610) sun.net.spi.nameservice.NameService has moved to a new location

2018-08-03 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568509#comment-16568509
 ] 

Akira Ajisaka commented on HDFS-11610:
--

Thanks [~tasanuma0829] for your first commit!

> sun.net.spi.nameservice.NameService has moved to a new location
> ---
>
> Key: HDFS-11610
> URL: https://issues.apache.org/jira/browse/HDFS-11610
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-11610.001.patch, HDFS-11610.002.patch
>
>
> sun.net.spi.nameservice.NameService was moved to 
> java.net.InetAddress$NameService in Java 9. TestDFSClientFailover uses this 
> class to spy nameservice.






[jira] [Commented] (HDFS-13776) RBF: Add Storage policies related ClientProtocol APIs

2018-08-03 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568506#comment-16568506
 ] 

Íñigo Goiri commented on HDFS-13776:


Thanks [~dibyendu_hadoop] for [^HDFS-13776-000.patch].
A couple of comments:
* Can we move the {{getLocationsForPath()}} chunks inside 
{{RouterStoragePolicy}}? (The erasure coding module does that.)
* Fix the javac warning by using assertArrayEquals or so.
* Minor style nits:
** Remove the extra space in {{private  final RouterStoragePolicy 
storagePolicy;}}.
** Add an extra line at RouterRpcServer#211 to keep the double-space separation.

The tests look good; have you tested this in a cluster or so?

> RBF: Add Storage policies related ClientProtocol APIs
> -
>
> Key: HDFS-13776
> URL: https://issues.apache.org/jira/browse/HDFS-13776
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13776-000.patch
>
>
> Currently unsetStoragePolicy and getStoragePolicy are not implemented in 
> RouterRpcServer.






[jira] [Commented] (HDDS-265) Move numPendingDeletionBlocks and deleteTransactionId from ContainerData to KeyValueContainerData

2018-08-03 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568490#comment-16568490
 ] 

Bharat Viswanadham commented on HDDS-265:
-

Triggered a Jenkins run:

https://builds.apache.org/job/PreCommit-HDDS-Build/690/console

> Move numPendingDeletionBlocks and deleteTransactionId from ContainerData to 
> KeyValueContainerData
> -
>
> Key: HDDS-265
> URL: https://issues.apache.org/jira/browse/HDDS-265
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.2.1
>Reporter: Hanisha Koneru
>Assignee: LiXin Ge
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-265.000.patch
>
>
> "numPendingDeletionBlocks" and "deleteTransactionId" fields are specific to 
> KeyValueContainers. As such they should be moved to KeyValueContainerData 
> from ContainerData.
> ContainerReport should also be refactored to take in this change. 
> Please refer to [~ljain]'s comment in HDDS-250.






[jira] [Comment Edited] (HDFS-13738) fsck -list-corruptfileblocks has infinite loop if user is not privileged.

2018-08-03 Thread Yuen-Kuei Hsueh (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568442#comment-16568442
 ] 

Yuen-Kuei Hsueh edited comment on HDFS-13738 at 8/3/18 4:29 PM:


[~jojochuang] Thanks for your advice. I have added some assertions in your test 
case. [^HDFS-13738.002.patch]


was (Author: study):
[~jojochuang] Thanks for your advice. I have added some assertions in your test 
case.

> fsck -list-corruptfileblocks has infinite loop if user is not privileged.
> -
>
> Key: HDFS-13738
> URL: https://issues.apache.org/jira/browse/HDFS-13738
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.6.0, 3.0.0
> Environment: Kerberized Hadoop cluster
>Reporter: Wei-Chiu Chuang
>Assignee: Yuen-Kuei Hsueh
>Priority: Major
> Attachments: HDFS-13738.001.patch, HDFS-13738.002.patch, 
> HDFS-13738.test.patch
>
>
> Found an interesting bug.
> Execute the following command as any non-privileged user:
> {noformat}
> # run fsck
> $ hdfs fsck / -list-corruptfileblocks
> {noformat}
> {noformat}
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 1 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 0 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 1 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> {noformat}
> Reproducible on Hadoop 3.0.0 as well as 2.6.0






[jira] [Commented] (HDFS-13738) fsck -list-corruptfileblocks has infinite loop if user is not privileged.

2018-08-03 Thread Yuen-Kuei Hsueh (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568442#comment-16568442
 ] 

Yuen-Kuei Hsueh commented on HDFS-13738:


[~jojochuang] Thanks for your advice. I have added some assertions in your test 
case.

> fsck -list-corruptfileblocks has infinite loop if user is not privileged.
> -
>
> Key: HDFS-13738
> URL: https://issues.apache.org/jira/browse/HDFS-13738
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.6.0, 3.0.0
> Environment: Kerberized Hadoop cluster
>Reporter: Wei-Chiu Chuang
>Assignee: Yuen-Kuei Hsueh
>Priority: Major
> Attachments: HDFS-13738.001.patch, HDFS-13738.002.patch, 
> HDFS-13738.test.patch
>
>
> Found an interesting bug.
> Execute the following command as any non-privileged user:
> {noformat}
> # run fsck
> $ hdfs fsck / -list-corruptfileblocks
> {noformat}
> {noformat}
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 1 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 0 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 1 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> {noformat}
> Reproducible on Hadoop 3.0.0 as well as 2.6.0






[jira] [Updated] (HDFS-13536) [PROVIDED Storage] HA for InMemoryAliasMap

2018-08-03 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13536:
---
Fix Version/s: 3.2.0

> [PROVIDED Storage] HA for InMemoryAliasMap
> --
>
> Key: HDFS-13536
> URL: https://issues.apache.org/jira/browse/HDFS-13536
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-13536.001.patch, HDFS-13536.002.patch, 
> HDFS-13536.003.patch
>
>
> Provide HA for the {{InMemoryLevelDBAliasMapServer}} to work with an HDFS NN 
> configured for high availability.






[jira] [Commented] (HDFS-13088) Allow HDFS files/blocks to be over-replicated.

2018-08-03 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568438#comment-16568438
 ] 

Íñigo Goiri commented on HDFS-13088:


I think that adding overreplication with a default of 0 to setReplication 
makes sense.
However, I'm torn between changing the method and adding an extra one with the 
new parameter.
The current approach in [^HDFS-13088.001.patch] adds the parameter and changes 
all the other tests.
Maybe we should add a new method instead of changing the existing one.

> Allow HDFS files/blocks to be over-replicated.
> --
>
> Key: HDFS-13088
> URL: https://issues.apache.org/jira/browse/HDFS-13088
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
>Priority: Major
> Attachments: HDFS-13088.001.patch
>
>
> This JIRA is to add a per-file "over-replication" factor to HDFS. As 
> mentioned in HDFS-13069, the over-replication factor will be the excess 
> replicas that will be allowed to exist for a file or block. This is 
> beneficial if the application deems that additional replicas of a file are 
> needed. In the case of HDFS-13069, it would allow copies of data in PROVIDED 
> storage to be cached locally in HDFS in a read-through manner.
> The Namenode will not proactively meet the over-replication, i.e., it does 
> not schedule replications when the number of replicas for a block is less 
> than (replication factor + over-replication factor), as long as it is at 
> least the replication factor of the file.
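The scheduling rule described above can be sketched as a pair of predicates. 
This is a hypothetical stand-alone illustration (the class and method names 
are invented; this is not the actual NameNode/BlockManager logic): new 
replicas are scheduled only below the replication factor, and excess replicas 
are removed only above replication plus over-replication.

```java
// Hypothetical sketch of the proposed rule. "replication" is the
// file's replication factor and "over" its over-replication factor.
public class OverReplicationSketch {
    // Schedule new replicas only when live replicas fall below the
    // plain replication factor.
    public static boolean shouldScheduleReplication(int live, int replication) {
        return live < replication;
    }

    // Remove excess replicas only when live replicas exceed the sum of
    // the replication and over-replication factors.
    public static boolean shouldDeleteExcess(int live, int replication,
                                             int over) {
        return live > replication + over;
    }

    public static void main(String[] args) {
        // replication = 3, over-replication = 2: 4 live replicas are
        // tolerated (neither scheduled up nor deleted down).
        System.out.println(shouldScheduleReplication(4, 3)); // false
        System.out.println(shouldDeleteExcess(4, 3, 2));     // false
        System.out.println(shouldScheduleReplication(2, 3)); // true
        System.out.println(shouldDeleteExcess(6, 3, 2));     // true
    }
}
```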






[jira] [Updated] (HDFS-13738) fsck -list-corruptfileblocks has infinite loop if user is not privileged.

2018-08-03 Thread Yuen-Kuei Hsueh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuen-Kuei Hsueh updated HDFS-13738:
---
Attachment: HDFS-13738.002.patch

> fsck -list-corruptfileblocks has infinite loop if user is not privileged.
> -
>
> Key: HDFS-13738
> URL: https://issues.apache.org/jira/browse/HDFS-13738
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.6.0, 3.0.0
> Environment: Kerberized Hadoop cluster
>Reporter: Wei-Chiu Chuang
>Assignee: Yuen-Kuei Hsueh
>Priority: Major
> Attachments: HDFS-13738.001.patch, HDFS-13738.002.patch, 
> HDFS-13738.test.patch
>
>
> Found an interesting bug.
> Execute the following command as any non-privileged user:
> {noformat}
> # run fsck
> $ hdfs fsck / -list-corruptfileblocks
> {noformat}
> {noformat}
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 1 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 0 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 1 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> {noformat}
> Reproducible on Hadoop 3.0.0 as well as 2.6.0






[jira] [Commented] (HDFS-13776) RBF: Add Storage policies related ClientProtocol APIs

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568355#comment-16568355
 ] 

genericqa commented on HDFS-13776:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 24s{color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs-rbf generated 1 new + 0 unchanged 
- 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
57s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13776 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934271/HDFS-13776-000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ca48df0b5a4e 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3426f40 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24694/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24694/testReport/ |
| Max. process+thread count | 952 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24694/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.ap

[jira] [Commented] (HDFS-10240) Race between close/recoverLease leads to missing block

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568335#comment-16568335
 ] 

genericqa commented on HDFS-10240:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 19s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 50s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-10240 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934263/HDFS-10240-002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 62c243cfbd78 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 022592a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24693/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24693/testReport/ |
| Max. process+thread count | 3835 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24693/console |
| Powered b

[jira] [Commented] (HDFS-11610) sun.net.spi.nameservice.NameService has moved to a new location

2018-08-03 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568317#comment-16568317
 ] 

Hudson commented on HDFS-11610:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14706 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14706/])
HDFS-11610. sun.net.spi.nameservice.NameService has moved to a new (tasanuma: 
rev 2b18bb4f37b9eec504b084e11384ec5ede4e0779)
* (edit) hadoop-hdfs-project/hadoop-hdfs/pom.xml


> sun.net.spi.nameservice.NameService has moved to a new location
> ---
>
> Key: HDFS-11610
> URL: https://issues.apache.org/jira/browse/HDFS-11610
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-11610.001.patch, HDFS-11610.002.patch
>
>
> sun.net.spi.nameservice.NameService was moved to 
> java.net.InetAddress$NameService in Java 9. TestDFSClientFailover uses this 
> class to spy nameservice.
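
The relocation can be seen with a small version probe. Note that the actual fix (per the commit above) adjusted the hadoop-hdfs pom rather than doing a runtime lookup, so the class-name switch below is purely illustrative.

```java
/**
 * Illustrative sketch: the fully-qualified name of the NameService SPI
 * class depends on the JDK. sun.net.spi.nameservice.NameService exists
 * up to Java 8; Java 9 moved it to java.net.InetAddress$NameService.
 * Not the HDFS-11610 fix itself, which was a pom.xml change.
 */
public class NameServiceLocator {
    static String nameServiceClassName() {
        // java.specification.version is "1.8" on Java 8 and "9", "10",
        // "11", ... on later releases.
        String spec = System.getProperty("java.specification.version");
        boolean preJava9 = spec.startsWith("1.");
        return preJava9
            ? "sun.net.spi.nameservice.NameService"
            : "java.net.InetAddress$NameService";
    }

    public static void main(String[] args) {
        System.out.println(nameServiceClassName());
    }
}
```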



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13775) Add a conf property to retry dns reverse lookups during datanode registration

2018-08-03 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568303#comment-16568303
 ] 

Kihwal Lee commented on HDFS-13775:
---

It could be meaningless if the host is running a caching daemon (e.g. nscd) 
that is doing negative caching. 

> Add a conf property to retry dns reverse lookups during datanode registration
> -
>
> Key: HDFS-13775
> URL: https://issues.apache.org/jira/browse/HDFS-13775
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Aniket Mokashi
>Priority: Major
>
> During DN registration, if the IP and the hostname in the DN InetAddress 
> (obtained from the socket) are the same (i.e. the reverse name lookup failed; see [the 
> javadoc|http://docs.oracle.com/javase/6/docs/api/java/net/InetAddress.html#getHostName%28%29]),
>  the DN will not be allowed to register. This is a performance improvement added 
> by HDFS-3990 to avoid 
> unnecessary DNS resolutions. However, in an environment where DNS lookups can 
> be flaky, this can cause datanode registration to fail permanently.
> We propose to add a conf property to retry DNS reverse lookup during DN 
> registration. Otherwise, clusters with flaky reverse DNS lookup will fail 
> permanently.
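
A minimal sketch of what such a retry could look like, assuming a simple fixed-backoff policy. The method name and parameters are hypothetical, since the issue only proposes adding "a conf property" without fixing its name or semantics; note also Kihwal Lee's caveat above that negative caching by a daemon such as nscd can make retries moot.

```java
import java.net.InetAddress;

/**
 * Hypothetical sketch of the proposed mitigation: retry the reverse
 * lookup a configurable number of times before giving up. The retry
 * policy and names here are illustrative, not part of HDFS.
 */
public class ReverseDnsRetry {
    static String resolveHostNameWithRetry(InetAddress addr,
                                           int maxAttempts,
                                           long backoffMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String host = addr.getCanonicalHostName();
            // Per the InetAddress javadoc, a failed reverse lookup returns
            // the textual IP address instead of a hostname.
            if (!host.equals(addr.getHostAddress())) {
                return host;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(backoffMillis);
            }
        }
        return addr.getHostAddress(); // still unresolved after retries
    }

    public static void main(String[] args) throws Exception {
        InetAddress local = InetAddress.getLoopbackAddress();
        System.out.println(resolveHostNameWithRetry(local, 3, 100L));
    }
}
```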






[jira] [Updated] (HDFS-11610) sun.net.spi.nameservice.NameService has moved to a new location

2018-08-03 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-11610:

  Resolution: Fixed
Target Version/s: 3.2.0
  Status: Resolved  (was: Patch Available)

> sun.net.spi.nameservice.NameService has moved to a new location
> ---
>
> Key: HDFS-11610
> URL: https://issues.apache.org/jira/browse/HDFS-11610
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-11610.001.patch, HDFS-11610.002.patch
>
>
> sun.net.spi.nameservice.NameService was moved to 
> java.net.InetAddress$NameService in Java 9. TestDFSClientFailover uses this 
> class to spy nameservice.






[jira] [Commented] (HDFS-11610) sun.net.spi.nameservice.NameService has moved to a new location

2018-08-03 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568283#comment-16568283
 ] 

Takanobu Asanuma commented on HDFS-11610:
-

Committed to trunk. Thanks for the contribution, [~ajisakaa]!

> sun.net.spi.nameservice.NameService has moved to a new location
> ---
>
> Key: HDFS-11610
> URL: https://issues.apache.org/jira/browse/HDFS-11610
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-11610.001.patch, HDFS-11610.002.patch
>
>
> sun.net.spi.nameservice.NameService was moved to 
> java.net.InetAddress$NameService in Java 9. TestDFSClientFailover uses this 
> class to spy nameservice.






[jira] [Commented] (HDFS-13738) fsck -list-corruptfileblocks has infinite loop if user is not privileged.

2018-08-03 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568271#comment-16568271
 ] 

Wei-Chiu Chuang commented on HDFS-13738:


Thanks for the patch; the fix looks good to me. Would you also like to 
contribute a test case?
As a starting point, you can add test methods to the TestFsck class.

> fsck -list-corruptfileblocks has infinite loop if user is not privileged.
> -
>
> Key: HDFS-13738
> URL: https://issues.apache.org/jira/browse/HDFS-13738
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.6.0, 3.0.0
> Environment: Kerberized Hadoop cluster
>Reporter: Wei-Chiu Chuang
>Assignee: Yuen-Kuei Hsueh
>Priority: Major
> Attachments: HDFS-13738.001.patch, HDFS-13738.test.patch
>
>
> Found an interesting bug.
> Execute the following command as any non-privileged user:
> {noformat}
> # run fsck
> $ hdfs fsck / -list-corruptfileblocks
> {noformat}
> {noformat}
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 1 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 0 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> FSCK ended at Mon Jul 16 15:14:03 PDT 2018 in 1 milliseconds
> Access denied for user systest. Superuser privilege is required
> Fsck on path '/' FAILED
> {noformat}
> Reproducible on Hadoop 3.0.0 as well as 2.6.0






[jira] [Commented] (HDFS-13728) Disk Balancer should not fail if volume usage is greater than capacity

2018-08-03 Thread Stephen O'Donnell (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568264#comment-16568264
 ] 

Stephen O'Donnell commented on HDFS-13728:
--

I think it's ready for review now. Please have a look; I can own this one if 
you want.

> Disk Balancer should not fail if volume usage is greater than capacity
> --
>
> Key: HDFS-13728
> URL: https://issues.apache.org/jira/browse/HDFS-13728
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: diskbalancer
>Affects Versions: 3.0.3
>Reporter: Stephen O'Donnell
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HDFS-13728.001.patch
>
>
> We have seen a couple of scenarios where the disk balancer fails because a 
> datanode reports more space used on a disk than its capacity, which should 
> not be possible.
> This is due to the check below in DiskBalancerVolume.java:
> {code}
>   public void setUsed(long dfsUsedSpace) {
> Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
> "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
> dfsUsedSpace, getCapacity());
> this.used = dfsUsedSpace;
>   }
> {code}
> While I agree that it should not be possible for a DN to report more usage on 
> a volume than its capacity, there seems to be some issue that causes this to 
> occur sometimes.
> In general, this full disk is what causes someone to want to run the Disk 
> Balancer, only to find it fails with the error.
> There appears to be nothing you can do to force the Disk Balancer to run at 
> this point, but in the scenarios I saw, some data was removed from the disk 
> and usage dropped below the capacity resolving the issue.
> Can we consider relaxing the above check: if the usage is greater than 
> the capacity, just set the usage to the capacity so the calculations all work 
> OK?
> E.g. something like this:
> {code}
>public void setUsed(long dfsUsedSpace) {
> -Preconditions.checkArgument(dfsUsedSpace < this.getCapacity());
> -this.used = dfsUsedSpace;
> +if (dfsUsedSpace > this.getCapacity()) {
> +  this.used = this.getCapacity();
> +} else {
> +  this.used = dfsUsedSpace;
> +}
>}
> {code}
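
The proposed relaxation boils down to a clamp. A minimal self-contained sketch follows; the `setUsed` signature mirrors the snippet above, but the class itself is illustrative, not the real DiskBalancerVolume.

```java
/**
 * Minimal sketch of the relaxed setter: instead of rejecting a reported
 * usage that exceeds capacity, clamp it into [0, capacity] so downstream
 * calculations stay well-defined. Illustrative only -- not the actual
 * DiskBalancerVolume class.
 */
public class VolumeUsageClamp {
    private final long capacity;
    private long used;

    public VolumeUsageClamp(long capacity) {
        this.capacity = capacity;
    }

    /** Clamp dfsUsedSpace to capacity rather than throwing. */
    public void setUsed(long dfsUsedSpace) {
        this.used = Math.min(Math.max(dfsUsedSpace, 0L), capacity);
    }

    public long getUsed() {
        return used;
    }

    public static void main(String[] args) {
        VolumeUsageClamp v = new VolumeUsageClamp(100L);
        v.setUsed(150L);                 // over-reported usage
        System.out.println(v.getUsed()); // prints 100
    }
}
```

Compared with the if/else diff above, `Math.min` expresses the same intent in one line; whether to also log a warning when clamping is a separate design choice.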






[jira] [Comment Edited] (HDFS-13728) Disk Balancer should not fail if volume usage is greater than capacity

2018-08-03 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568254#comment-16568254
 ] 

Gabor Bota edited comment on HDFS-13728 at 8/3/18 2:23 PM:
---

[~sodonnell], I'd rather review this, since you have already made and uploaded 
a patch for it.
Do you think it's not ready for review, or that it will involve much 
additional work?
If you feel like finishing it, please assign it to yourself; if you don't have 
enough time for it, leave a comment and I'll jump on this.


was (Author: gabor.bota):
[~sodonnell], I'd rather review this if you have already made and uploaded a 
patch for it.
Do you think it's not ready for a review? Do you think that it will involve 
that much additional work?
If you feel like finishing it, please assign to yourself, but if you don't have 
enough time for it make a comment about that and I'll jump on this.

> Disk Balancer should not fail if volume usage is greater than capacity
> --
>
> Key: HDFS-13728
> URL: https://issues.apache.org/jira/browse/HDFS-13728
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: diskbalancer
>Affects Versions: 3.0.3
>Reporter: Stephen O'Donnell
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HDFS-13728.001.patch
>
>
> We have seen a couple of scenarios where the disk balancer fails because a 
> datanode reports more space used on a disk than its capacity, which should 
> not be possible.
> This is due to the check below in DiskBalancerVolume.java:
> {code}
>   public void setUsed(long dfsUsedSpace) {
> Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
> "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
> dfsUsedSpace, getCapacity());
> this.used = dfsUsedSpace;
>   }
> {code}
> While I agree that it should not be possible for a DN to report more usage on 
> a volume than its capacity, there seems to be some issue that causes this to 
> occur sometimes.
> In general, this full disk is what causes someone to want to run the Disk 
> Balancer, only to find it fails with the error.
> There appears to be nothing you can do to force the Disk Balancer to run at 
> this point, but in the scenarios I saw, some data was removed from the disk 
> and usage dropped below the capacity resolving the issue.
> Can we consider relaxing the above check: if the usage is greater than 
> the capacity, just set the usage to the capacity so the calculations all work 
> OK?
> E.g. something like this:
> {code}
>public void setUsed(long dfsUsedSpace) {
> -Preconditions.checkArgument(dfsUsedSpace < this.getCapacity());
> -this.used = dfsUsedSpace;
> +if (dfsUsedSpace > this.getCapacity()) {
> +  this.used = this.getCapacity();
> +} else {
> +  this.used = dfsUsedSpace;
> +}
>}
> {code}






[jira] [Commented] (HDFS-13728) Disk Balancer should not fail if volume usage is greater than capacity

2018-08-03 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568254#comment-16568254
 ] 

Gabor Bota commented on HDFS-13728:
---

[~sodonnell], I'd rather review this, since you have already made and uploaded 
a patch for it.
Do you think it's not ready for review, or that it will involve much 
additional work?
If you feel like finishing it, please assign it to yourself; if you don't have 
enough time for it, leave a comment and I'll jump on this.

> Disk Balancer should not fail if volume usage is greater than capacity
> --
>
> Key: HDFS-13728
> URL: https://issues.apache.org/jira/browse/HDFS-13728
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: diskbalancer
>Affects Versions: 3.0.3
>Reporter: Stephen O'Donnell
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HDFS-13728.001.patch
>
>
> We have seen a couple of scenarios where the disk balancer fails because a 
> datanode reports more space used on a disk than its capacity, which should 
> not be possible.
> This is due to the check below in DiskBalancerVolume.java:
> {code}
>   public void setUsed(long dfsUsedSpace) {
> Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
> "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
> dfsUsedSpace, getCapacity());
> this.used = dfsUsedSpace;
>   }
> {code}
> While I agree that it should not be possible for a DN to report more usage on 
> a volume than its capacity, there seems to be some issue that causes this to 
> occur sometimes.
> In general, this full disk is what causes someone to want to run the Disk 
> Balancer, only to find it fails with the error.
> There appears to be nothing you can do to force the Disk Balancer to run at 
> this point, but in the scenarios I saw, some data was removed from the disk 
> and usage dropped below the capacity resolving the issue.
> Can we consider relaxing the above check: if the usage is greater than 
> the capacity, just set the usage to the capacity so the calculations all work 
> OK?
> E.g. something like this:
> {code}
>public void setUsed(long dfsUsedSpace) {
> -Preconditions.checkArgument(dfsUsedSpace < this.getCapacity());
> -this.used = dfsUsedSpace;
> +if (dfsUsedSpace > this.getCapacity()) {
> +  this.used = this.getCapacity();
> +} else {
> +  this.used = dfsUsedSpace;
> +}
>}
> {code}






[jira] [Commented] (HDFS-13776) RBF: Add Storage policies related ClientProtocol APIs

2018-08-03 Thread Dibyendu Karmakar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568252#comment-16568252
 ] 

Dibyendu Karmakar commented on HDFS-13776:
--

Hi [~elgoiri],

Added the patch. With these changes, I have moved all the storage-policy-related 
APIs to a new module.

Could you please take a look?

> RBF: Add Storage policies related ClientProtocol APIs
> -
>
> Key: HDFS-13776
> URL: https://issues.apache.org/jira/browse/HDFS-13776
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13776-000.patch
>
>
> Currently unsetStoragePolicy and getStoragePolicy are not implemented in 
> RouterRpcServer.






[jira] [Updated] (HDFS-13776) RBF: Add Storage policies related ClientProtocol APIs

2018-08-03 Thread Dibyendu Karmakar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dibyendu Karmakar updated HDFS-13776:
-
Status: Patch Available  (was: Open)

> RBF: Add Storage policies related ClientProtocol APIs
> -
>
> Key: HDFS-13776
> URL: https://issues.apache.org/jira/browse/HDFS-13776
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13776-000.patch
>
>
> Currently unsetStoragePolicy and getStoragePolicy are not implemented in 
> RouterRpcServer.






[jira] [Updated] (HDFS-13776) RBF: Add Storage policies related ClientProtocol APIs

2018-08-03 Thread Dibyendu Karmakar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dibyendu Karmakar updated HDFS-13776:
-
Attachment: HDFS-13776-000.patch

> RBF: Add Storage policies related ClientProtocol APIs
> -
>
> Key: HDFS-13776
> URL: https://issues.apache.org/jira/browse/HDFS-13776
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13776-000.patch
>
>
> Currently unsetStoragePolicy and getStoragePolicy are not implemented in 
> RouterRpcServer.






[jira] [Commented] (HDFS-11610) sun.net.spi.nameservice.NameService has moved to a new location

2018-08-03 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568179#comment-16568179
 ] 

Takanobu Asanuma commented on HDFS-11610:
-

The failed tests succeeded in my local environment. LGTM, +1.

> sun.net.spi.nameservice.NameService has moved to a new location
> ---
>
> Key: HDFS-11610
> URL: https://issues.apache.org/jira/browse/HDFS-11610
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-11610.001.patch, HDFS-11610.002.patch
>
>
> sun.net.spi.nameservice.NameService was moved to 
> java.net.InetAddress$NameService in Java 9. TestDFSClientFailover uses this 
> class to spy nameservice.






[jira] [Commented] (HDFS-10240) Race between close/recoverLease leads to missing block

2018-08-03 Thread Jinglun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568158#comment-16568158
 ] 

Jinglun commented on HDFS-10240:


Hi [~jojochuang], I think patch HDFS-10240-001.patch still works and won't 
conflict with HDFS-12886. BlockManager.commitOrCompleteLastBlock is only called 
in three situations:
1. a complete RPC
2. an addBlock RPC
3. a commitBlockSynchronization RPC by BlockRecoveryWorker
HDFS-12886 deals with situation 3: it ignores minReplication when doing block 
recovery. HDFS-10240-001.patch deals with situations 1 and 2: it stops the 
client from completing an under-recovery block.
In situation 3, the block's state will be converted to UC in 
FSNamesystem.commitBlockSynchronization, just before 
BlockManager.commitOrCompleteLastBlock is called, so that path won't be 
affected by the patch. Situations 1 and 2 don't do the conversion, so they 
will be prevented.
As for the assert error: with this patch the client will get an IOE with the 
message "Commit or complete block xxx whereas it is under recovery.", which 
looks good.
I did a simple merge of HDFS-10240-001.patch and HDFS-10240.test.patch, with a 
small change in the UT. I'd appreciate your comments, [~jojochuang] [~sinago]
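
The guard described above can be reduced to a toy model. The enum values below mirror the real BlockUCState names, but the class and method bodies are illustrative, not the actual BlockManager code.

```java
/**
 * Toy model of the guard: client-driven paths (complete / addBlock) must
 * not commit a last block that is under recovery, while the
 * commitBlockSynchronization path first converts the state back to
 * UNDER_CONSTRUCTION and is therefore unaffected. Names are illustrative.
 */
public class LastBlockGuard {
    enum BlockUCState { UNDER_CONSTRUCTION, UNDER_RECOVERY, COMMITTED }

    static BlockUCState commitOrCompleteLastBlock(BlockUCState state)
            throws java.io.IOException {
        if (state == BlockUCState.UNDER_RECOVERY) {
            // Mirrors the message quoted in the comment above.
            throw new java.io.IOException(
                "Commit or complete block whereas it is under recovery.");
        }
        return BlockUCState.COMMITTED;
    }

    public static void main(String[] args) throws Exception {
        // Paths 1 and 2 (complete, addBlock): a plain UC block commits fine.
        System.out.println(
            commitOrCompleteLastBlock(BlockUCState.UNDER_CONSTRUCTION));
        // Path 3 (commitBlockSynchronization) converts the state back to
        // UNDER_CONSTRUCTION before this call, so it also succeeds.
        // A block still under recovery, as seen by the client paths,
        // is rejected:
        try {
            commitOrCompleteLastBlock(BlockUCState.UNDER_RECOVERY);
        } catch (java.io.IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```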

> Race between close/recoverLease leads to missing block
> --
>
> Key: HDFS-10240
> URL: https://issues.apache.org/jira/browse/HDFS-10240
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-10240 scenarios.jpg, HDFS-10240-001.patch, 
> HDFS-10240-002.patch, HDFS-10240.test.patch
>
>
> We got a missing block in our cluster, and logs related to the missing block 
> are as follows:
> 2016-03-28,10:00:06,188 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: XX. BP-219149063-10.108.84.25-1446859315800 
> blk_1226490256_153006345{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,205 INFO BlockStateChange: BLOCK* 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
>  recovery started, 
> primary=ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]
> 2016-03-28,10:00:06,205 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File XX has not been closed. Lease 
> recovery is in progress. RecoveryId = 153006357 for block 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,248 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
> checkFileProgress: blk_1226490256_153006345{blockUCState=COMMITTED, 
> primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  has not reached minimal replication 1
> 2016-03-28,10:00:06,358 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.53:11402 is added to 
> blk_1226490256_153006345{blockUCState=COMMITTED, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  size 139
> 2016-03-28,10:00:06,441 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.44:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:06,660 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.6.14:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:08,808 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-219149063-10.108.84.25-1446859315800:blk_1226490256_1530063

[jira] [Updated] (HDFS-10240) Race between close/recoverLease leads to missing block

2018-08-03 Thread Jinglun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-10240:
---
Attachment: HDFS-10240-002.patch

> Race between close/recoverLease leads to missing block
> --
>
> Key: HDFS-10240
> URL: https://issues.apache.org/jira/browse/HDFS-10240
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-10240 scenarios.jpg, HDFS-10240-001.patch, 
> HDFS-10240-002.patch, HDFS-10240.test.patch
>
>
> We got a missing block in our cluster, and logs related to the missing block 
> are as follows:
> 2016-03-28,10:00:06,188 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: XX. BP-219149063-10.108.84.25-1446859315800 
> blk_1226490256_153006345{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,205 INFO BlockStateChange: BLOCK* 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
>  recovery started, 
> primary=ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]
> 2016-03-28,10:00:06,205 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File XX has not been closed. Lease 
> recovery is in progress. RecoveryId = 153006357 for block 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,248 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
> checkFileProgress: blk_1226490256_153006345{blockUCState=COMMITTED, 
> primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  has not reached minimal replication 1
> 2016-03-28,10:00:06,358 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.53:11402 is added to 
> blk_1226490256_153006345{blockUCState=COMMITTED, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  size 139
> 2016-03-28,10:00:06,441 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.44:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:06,660 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.6.14:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:08,808 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-219149063-10.108.84.25-1446859315800:blk_1226490256_153006345,
>  newgenerationstamp=153006357, newlength=139, newtargets=[10.114.6.14:11402, 
> 10.114.5.53:11402, 10.114.5.44:11402], closeFile=true, deleteBlock=false)
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.6.14:11402 by /10.114.6.14 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.53:11402 by /10.114.5.53 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,837 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.44:11402 by /10.114.5.44 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> From the log, I guess this is what has happened in order:
> 1  Process A open a file F for 

[jira] [Commented] (HDFS-11610) sun.net.spi.nameservice.NameService has moved to a new location

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568117#comment-16568117
 ] 

genericqa commented on HDFS-11610:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
40m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  8s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-11610 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917824/HDFS-11610.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  |
| uname | Linux edcfccd12f67 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 40ab8ee |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24692/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24692/testReport/ |
| Max. process+thread count | 3949 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24692/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> sun.net.spi.nameservice.NameService has moved to a new location
> ---
>
> Key: HDFS-11610
>

[jira] [Commented] (HDDS-230) ContainerStateMachine should provide readStateMachineData api to read data if Containers with required during replication

2018-08-03 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568089#comment-16568089
 ] 

genericqa commented on HDDS-230:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
27s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
58s{color} | {color:green} container-service in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
39s{color} | {color:green} ozone-manager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JI

[jira] [Updated] (HDDS-322) Restructure ChunkGroupOutputStream and ChunkOutputStream

2018-08-03 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-322:
-
Description: 
Currently, ChunkOutputStream allocates a chunk-size buffer to cache client data. 
The idea here is to allocate the buffer in ChunkGroupOutputStream and pass it 
to the underlying ChunkOutputStream, so that reclaiming the uncommitted 
leftover data in the buffer and reallocating it to the next block while handling 
CLOSE_CONTAINER_EXCEPTION in the ozone client becomes simpler. This Jira will also 
add code to close the underlying ChunkOutputStream as soon as the complete 
block data is written.

This Jira will also modify the PutKey response to contain the committed block 
length, so that the ozone client can do some validation checks.

  was:
Currently, ChunkOutputStream allocate a chunk size buffer to cache client data. 
The idea here is to allocate the buffer in ChunkGroupOutputStream and pass it 
to underlying  ChunkOutputStream so as to make reclaiming of the uncommitted 
leftover data in the buffer and reallocating to next Block while handling  
CLOSE_CONTAINER_EXCEPTION in ozone client becomes simpler. This Jira will also 
add the code to close the underlying stream 

This Jira will also modify the PutKey response to contain the committed block 
length to ozone client  to do some validation checks.


> Restructure ChunkGroupOutputStream and ChunkOutputStream
> 
>
> Key: HDDS-322
> URL: https://issues.apache.org/jira/browse/HDDS-322
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.2.1
>
>
> Currently, ChunkOutputStream allocates a chunk-size buffer to cache client 
> data. The idea here is to allocate the buffer in ChunkGroupOutputStream and 
> pass it to the underlying ChunkOutputStream, so that reclaiming the 
> uncommitted leftover data in the buffer and reallocating it to the next block 
> while handling CLOSE_CONTAINER_EXCEPTION in the ozone client becomes simpler. 
> This Jira will also add code to close the underlying ChunkOutputStream as 
> soon as the complete block data is written.
> This Jira will also modify the PutKey response to contain the committed block 
> length, so that the ozone client can do some validation checks.
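The buffer-ownership change described above can be sketched as follows. This is a hedged illustration under assumed names (ChunkStreamSketch and its methods are not the real Ozone client API): the group stream owns the buffer, tracks how much of it the datanode has committed, and on a CLOSE_CONTAINER_EXCEPTION simply reclaims the uncommitted tail to replay into the next block.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; class and method names are hypothetical.
class ChunkStreamSketch {
  private final ByteBuffer buffer;  // shared buffer, owned by the group stream
  private int committed = 0;        // bytes acknowledged by the datanode

  ChunkStreamSketch(int chunkSize) {
    buffer = ByteBuffer.allocate(chunkSize);
  }

  void write(byte[] data) {
    buffer.put(data);
  }

  // Simulate a successful flush: everything buffered so far is committed.
  void flushCommitted() {
    committed = buffer.position();
  }

  // On CLOSE_CONTAINER_EXCEPTION: return the uncommitted tail so the
  // caller can reallocate it to the next block.
  byte[] reclaimUncommitted() {
    byte[] leftover = new byte[buffer.position() - committed];
    System.arraycopy(buffer.array(), committed, leftover, 0, leftover.length);
    return leftover;
  }

  public static void main(String[] args) {
    ChunkStreamSketch s = new ChunkStreamSketch(16);
    s.write("abcd".getBytes());
    s.flushCommitted();               // "abcd" acknowledged by the datanode
    s.write("ef".getBytes());         // container closes before the next flush
    System.out.println(new String(s.reclaimUncommitted()));
  }
}
```

Because the buffer outlives the per-block stream, the leftover bytes need no copy back from a failed ChunkOutputStream; the group stream already holds them.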



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-322) Restructure ChunkGroupOutputStream and ChunkOutputStream

2018-08-03 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-322:
-
Description: 
Currently, ChunkOutputStream allocate a chunk size buffer to cache client data. 
The idea here is to allocate the buffer in ChunkGroupOutputStream and pass it 
to underlying  ChunkOutputStream so as to make reclaiming of the uncommitted 
leftover data in the buffer and reallocating to next Block while handling  
CLOSE_CONTAINER_EXCEPTION in ozone client becomes simpler. This Jira will also 
add the code to close the underlying stream 

This Jira will also modify the PutKey response to contain the committed block 
length to ozone client  to do some validation checks.

  was:Currently, ChunkOutputStream allocate a chunk size buffer to cache client 
data. The idea here is to allocate the buffer in ChunkGroupOutputStream and 
pass it to underlying  ChunkOutputStream so as to make reclaiming of the 
uncommitted leftover data in the buffer and reallocating to next Block while 
handling  CLOSE_CONTAINER_EXCEPTION in ozone client becomes simpler. This Jira 
will also modify the PutKey response to contain the committed block length to 
ozone client  to do some validation checks.


> Restructure ChunkGroupOutputStream and ChunkOutputStream
> 
>
> Key: HDDS-322
> URL: https://issues.apache.org/jira/browse/HDDS-322
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.2.1
>
>
> Currently, ChunkOutputStream allocates a chunk-size buffer to cache client 
> data. The idea here is to allocate the buffer in ChunkGroupOutputStream and 
> pass it to the underlying ChunkOutputStream, so that reclaiming the 
> uncommitted leftover data in the buffer and reallocating it to the next block 
> while handling CLOSE_CONTAINER_EXCEPTION in the ozone client becomes simpler. 
> This Jira will also add code to close the underlying stream.
> This Jira will also modify the PutKey response to contain the committed block 
> length, so that the ozone client can do some validation checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


