[jira] [Updated] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-18 Thread Lisheng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-14313:
---
Attachment: HDFS-14313.007.patch

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> There are two existing ways of getting used space, DU and DF, and both are 
> insufficient.
>  #  Running DU across lots of disks is very expensive, and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk is shared by multiple datanodes or 
> other services.
>  Getting HDFS used space from the FsDatasetImpl#volumeMap#ReplicaInfos already 
> in memory has very low overhead and is accurate.
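The in-memory approach described above can be sketched as follows. This is a minimal illustration, not the actual Hadoop code: `ReplicaInfo` and the per-block map are simplified stand-ins for the real `FsDatasetImpl#volumeMap` structures.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: compute used space by summing in-memory replica metadata instead
// of shelling out to du/df. ReplicaInfo is a simplified stand-in.
public class UsedSpaceSketch {
    static class ReplicaInfo {
        final long numBytes;   // block data file length
        final long metaBytes;  // checksum/meta file length
        ReplicaInfo(long numBytes, long metaBytes) {
            this.numBytes = numBytes;
            this.metaBytes = metaBytes;
        }
    }

    // blockId -> replica, analogous to one block pool's slice of volumeMap
    static long usedSpace(Map<Long, ReplicaInfo> replicaMap) {
        long used = 0;
        for (ReplicaInfo r : replicaMap.values()) {
            used += r.numBytes + r.metaBytes;
        }
        return used;
    }

    public static void main(String[] args) {
        Map<Long, ReplicaInfo> map = new ConcurrentHashMap<>();
        map.put(1L, new ReplicaInfo(128L * 1024 * 1024, 1024));
        map.put(2L, new ReplicaInfo(64L * 1024 * 1024, 512));
        System.out.println(usedSpace(map));  // 201328128
    }
}
```

A single pass over the map is O(number of replicas) in memory, with no disk IO at all, which is why it avoids the DU spike entirely.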



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-18 Thread Lisheng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-14313:
---
Attachment: (was: HDFS-14313.007.patch)

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> There are two existing ways of getting used space, DU and DF, and both are 
> insufficient.
>  #  Running DU across lots of disks is very expensive, and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk is shared by multiple datanodes or 
> other services.
>  Getting HDFS used space from the FsDatasetImpl#volumeMap#ReplicaInfos already 
> in memory has very low overhead and is accurate.






[jira] [Work logged] (HDDS-1654) Ensure container state on datanode gets synced to disk whenever state change happens

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1654?focusedWorklogId=278885=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278885
 ]

ASF GitHub Bot logged work on HDDS-1654:


Author: ASF GitHub Bot
Created on: 18/Jul/19 10:36
Start Date: 18/Jul/19 10:36
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on issue #923: HDDS-1654. Ensure 
container state on datanode gets synced to disk whenever state change happens.
URL: https://github.com/apache/hadoop/pull/923#issuecomment-512762475
 
 
   The unit test failures are not related, and the acceptance test results in 
the details show everything passed. I am going to commit this patch.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 278885)
Time Spent: 1h 40m  (was: 1.5h)

> Ensure container state on datanode gets synced to disk whenever state change 
> happens
> 
>
> Key: HDDS-1654
> URL: https://issues.apache.org/jira/browse/HDDS-1654
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, whenever there is a container state change, the container is 
> updated but the change is not synced to disk.
> The idea here is to force-sync the state to disk every time there is a 
> state change.
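The force-sync pattern described above can be sketched like this. The file layout and method names are illustrative stand-ins, not the actual Ozone container code; the essential points are the `FileChannel.force(true)` before the rename and the atomic rename itself.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Sketch: persist a container's state durably on every state change instead
// of relying on the OS to flush the update eventually.
public class SyncedStateSketch {
    static void writeStateSynced(Path stateFile, String newState) throws IOException {
        Path tmp = stateFile.resolveSibling(stateFile.getFileName() + ".tmp");
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ch.write(ByteBuffer.wrap(newState.getBytes(StandardCharsets.UTF_8)));
            ch.force(true);  // fsync data and metadata before publishing
        }
        // Atomic rename: readers never observe a half-written state file.
        Files.move(tmp, stateFile, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("container");
        Path state = dir.resolve("state");
        writeStateSynced(state, "CLOSED");
        System.out.println(new String(Files.readAllBytes(state), StandardCharsets.UTF_8));
    }
}
```

Without the `force(true)`, a datanode crash between the state change and the OS flush could leave the on-disk state stale, which is the gap this issue closes.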






[jira] [Commented] (HDDS-1780) TestFailureHandlingByClient tests are flaky

2019-07-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887861#comment-16887861
 ] 

Hudson commented on HDDS-1780:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16947 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16947/])
HDDS-1780. TestFailureHandlingByClient tests are flaky. Contributed by 
(shashikant: rev ccceedb432bc2379e4480f8a9c5ebb181531c04e)
* (add) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestMultiBlockWritesWithDnFailures.java
* (edit) 
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/XceiverClientGrpc.java
* (edit) 
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockOutputStream.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestFailureHandlingByClient.java


> TestFailureHandlingByClient tests are flaky
> ---
>
> Key: HDDS-1780
> URL: https://issues.apache.org/jira/browse/HDDS-1780
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The tests seem to fail because, when the datanode goes down with the stale 
> node interval set to a low value, containers may get closed early, and client 
> writes may fail with a closed-container exception rather than the pipeline 
> failure/timeout exceptions expected by the tests. The fix made here is to 
> tune the stale node interval.
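The tuning described above amounts to a test-configuration change along these lines. The config key follows Ozone's naming for the SCM stale-node interval, but treat the exact key and values as illustrative rather than taken from the patch.

```java
import java.util.Properties;

// Sketch: raise the stale-node interval in the test cluster configuration so
// containers are not closed before the client observes the pipeline failure.
public class StaleIntervalSketch {
    public static void main(String[] args) {
        Properties conf = new Properties();
        // Too low: the downed datanode is declared stale almost immediately,
        // containers close early, and writes fail with a closed-container
        // exception instead of the pipeline failure the test expects.
        conf.setProperty("ozone.scm.stale.node.interval", "3s");
        // Tuned: give the client time to hit the pipeline failure first.
        conf.setProperty("ozone.scm.stale.node.interval", "90s");
        System.out.println(conf.getProperty("ozone.scm.stale.node.interval"));
    }
}
```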






[jira] [Updated] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-18 Thread Lisheng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-14313:
---
Attachment: HDFS-14313.007.patch

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> There are two existing ways of getting used space, DU and DF, and both are 
> insufficient.
>  #  Running DU across lots of disks is very expensive, and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk is shared by multiple datanodes or 
> other services.
>  Getting HDFS used space from the FsDatasetImpl#volumeMap#ReplicaInfos already 
> in memory has very low overhead and is accurate.






[jira] [Updated] (HDDS-1780) TestFailureHandlingByClient tests are flaky

2019-07-18 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-1780:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> TestFailureHandlingByClient tests are flaky
> ---
>
> Key: HDDS-1780
> URL: https://issues.apache.org/jira/browse/HDDS-1780
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The tests seem to fail because, when the datanode goes down with the stale 
> node interval set to a low value, containers may get closed early, and client 
> writes may fail with a closed-container exception rather than the pipeline 
> failure/timeout exceptions expected by the tests. The fix made here is to 
> tune the stale node interval.






[jira] [Work logged] (HDDS-1780) TestFailureHandlingByClient tests are flaky

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1780?focusedWorklogId=278883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278883
 ]

ASF GitHub Bot logged work on HDDS-1780:


Author: ASF GitHub Bot
Created on: 18/Jul/19 10:33
Start Date: 18/Jul/19 10:33
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on issue #1073: HDDS-1780. 
TestFailureHandlingByClient tests are flaky.
URL: https://github.com/apache/hadoop/pull/1073#issuecomment-512761446
 
 
   Thanks @mukul1987 @adoroszlai and @supratimdeka for the review. I have 
committed this change to trunk. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 278883)
Time Spent: 1h 50m  (was: 1h 40m)

> TestFailureHandlingByClient tests are flaky
> ---
>
> Key: HDDS-1780
> URL: https://issues.apache.org/jira/browse/HDDS-1780
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The tests seem to fail because, when the datanode goes down with the stale 
> node interval set to a low value, containers may get closed early, and client 
> writes may fail with a closed-container exception rather than the pipeline 
> failure/timeout exceptions expected by the tests. The fix made here is to 
> tune the stale node interval.






[jira] [Work logged] (HDDS-1780) TestFailureHandlingByClient tests are flaky

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1780?focusedWorklogId=278882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278882
 ]

ASF GitHub Bot logged work on HDDS-1780:


Author: ASF GitHub Bot
Created on: 18/Jul/19 10:32
Start Date: 18/Jul/19 10:32
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1073: HDDS-1780. 
TestFailureHandlingByClient tests are flaky.
URL: https://github.com/apache/hadoop/pull/1073
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 278882)
Time Spent: 1h 40m  (was: 1.5h)

> TestFailureHandlingByClient tests are flaky
> ---
>
> Key: HDDS-1780
> URL: https://issues.apache.org/jira/browse/HDDS-1780
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The tests seem to fail because, when the datanode goes down with the stale 
> node interval set to a low value, containers may get closed early, and client 
> writes may fail with a closed-container exception rather than the pipeline 
> failure/timeout exceptions expected by the tests. The fix made here is to 
> tune the stale node interval.






[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du

2019-07-18 Thread Lisheng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887859#comment-16887859
 ] 

Lisheng Sun commented on HDFS-14313:


Thanks [~linyiqun] for your suggestions.

 {quote}
why not use dataset lock?
{quote}
I don't use FsDatasetImpl#datasetLock because FsDatasetImpl#addBlockPool, which 
holds the datasetLock, calls FsDatasetImpl#deepCopyReplica in another thread. 
Following your suggestion, I use Collections.unmodifiableSet so that the replica 
info cannot be modified from outside.
I have updated the patch. Could you help review it? Thank you again.
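The `Collections.unmodifiableSet` approach mentioned above can be sketched as follows. The types are simplified stand-ins for the real `FsDatasetImpl`/`ReplicaInfo` classes; the point is that callers get a read-only view of the in-memory replicas without taking the dataset lock or deep-copying.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch: expose the in-memory replica set as an unmodifiable view so
// callers cannot mutate the dataset's internal state.
public class ReplicaViewSketch {
    private static final Set<String> replicas =
            new HashSet<>(Arrays.asList("blk_1", "blk_2"));

    static Set<String> getReplicas() {
        // Reads pass through to the backing set; any mutation attempt on the
        // view throws UnsupportedOperationException.
        return Collections.unmodifiableSet(replicas);
    }

    public static void main(String[] args) {
        Set<String> view = getReplicas();
        try {
            view.add("blk_3");
            System.out.println("mutated");
        } catch (UnsupportedOperationException e) {
            System.out.println("immutable view, size=" + view.size());
        }
    }
}
```

Note the view is read-only but not a snapshot: it still reflects concurrent changes made by the owner of the backing set, which is why the choice between an unmodifiable view and a deep copy matters for the locking discussion above.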


> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch
>
>
> There are two existing ways of getting used space, DU and DF, and both are 
> insufficient.
>  #  Running DU across lots of disks is very expensive, and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk is shared by multiple datanodes or 
> other services.
>  Getting HDFS used space from the FsDatasetImpl#volumeMap#ReplicaInfos already 
> in memory has very low overhead and is accurate.






[jira] [Commented] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

2019-07-18 Thread Chen Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887857#comment-16887857
 ] 

Chen Zhang commented on HDFS-13783:
---

Thanks [~xkrogen] for your detailed comments and suggestions; I'll refine the 
code and submit a patch later.

BTW, should I update the documentation in a separate JIRA?

> Balancer: make balancer to be a long service process for easy to monitor it.
> 
>
> Key: HDFS-13783
> URL: https://issues.apache.org/jira/browse/HDFS-13783
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: maobaolong
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-13783-001.patch, HDFS-13783-002.patch
>
>
> If the balancer were a long-running service process, like the namenode and 
> datanode, we could collect balancer metrics that tell us the status of the 
> balancer and the number of blocks it has moved.
> We could also get or set the balance plan via a balancer web UI. There are 
> many things we could do with a long-running balancer service process.
> So, shall we start to plan the new balancer? Hopefully this feature can make 
> the next release of Hadoop.






[jira] [Commented] (HDFS-14257) NPE when given the Invalid path to create target dir

2019-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887828#comment-16887828
 ] 

Hadoop QA commented on HDFS-14257:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
32s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 17s{color} | {color:orange} root: The patch generated 3 new + 199 unchanged 
- 1 fixed = 202 total (was 200) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 32s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}101m 20s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
50s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}215m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestFixKerberosTicketOrder |
|   | hadoop.fs.TestFsShellCopy |
|   | hadoop.security.TestRaceWhenRelogin |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14257 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12975129/HDFS-14257.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8a9b6622901c 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 
22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Created] (HDDS-1824) IllegalArgumentException in NetworkTopologyImpl causes SCM to shutdown

2019-07-18 Thread Lokesh Jain (JIRA)
Lokesh Jain created HDDS-1824:
-

 Summary: IllegalArgumentException in NetworkTopologyImpl causes 
SCM to shutdown
 Key: HDDS-1824
 URL: https://issues.apache.org/jira/browse/HDDS-1824
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM
Reporter: Lokesh Jain


 

 
{code:java}
2019-07-18 02:22:18,005 ERROR 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Exception in 
Replication Monitor Thread.
java.lang.IllegalArgumentException: Affinity node /default-rack/10.17.213.25 is 
not a member of topology
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.checkAffinityNode(NetworkTopologyImpl.java:780)
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.chooseRandom(NetworkTopologyImpl.java:408)
at 
org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseNode(SCMContainerPlacementRackAware.java:242)
at 
org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:168)
at 
org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:487)
at 
org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:293)
at 
java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4649)
at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
at 
org.apache.hadoop.hdds.scm.container.ReplicationManager.run(ReplicationManager.java:205)
at java.lang.Thread.run(Thread.java:745)
2019-07-18 02:22:18,008 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
status 1: java.lang.IllegalArgumentException: Affinity node 
/default-rack/10.17.213.25 is not a member of topology
2019-07-18 02:22:18,010 INFO 
org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SHUTDOWN_MSG:
{code}
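One defensive direction suggested by the stack trace above is to verify that the affinity node is actually registered in the topology before requesting a rack-aware placement, so a missing node degrades gracefully instead of propagating `IllegalArgumentException` up through the ReplicationManager and shutting down the SCM. The sketch below is purely illustrative; the class and method names are not the actual Ozone fix.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch: guard the affinity-node lookup so an unregistered node triggers a
// fallback placement rather than an exception that kills the SCM.
public class AffinityCheckSketch {
    private static final Set<String> topology =
            new HashSet<>(Arrays.asList("/default-rack/10.17.213.24"));

    static String chooseNode(String affinityNode) {
        if (affinityNode != null && !topology.contains(affinityNode)) {
            // Affinity node is gone (e.g. decommissioned between scheduling
            // and placement): fall back to an unconstrained choice.
            return topology.iterator().next();
        }
        return affinityNode;
    }

    public static void main(String[] args) {
        // The node from the log above is not in the topology; we fall back.
        System.out.println(chooseNode("/default-rack/10.17.213.25"));
    }
}
```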
 






[jira] [Commented] (HDFS-14625) Make DefaultAuditLogger class in FSnamesystem to Abstract

2019-07-18 Thread hemanthboyina (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887807#comment-16887807
 ] 

hemanthboyina commented on HDFS-14625:
--

yes [~elgoiri] , TestDirectoryScanner  was even failing in other JIRAs

> Make DefaultAuditLogger class in FSnamesystem to Abstract 
> --
>
> Key: HDFS-14625
> URL: https://issues.apache.org/jira/browse/HDFS-14625
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14625 (1).patch, HDFS-14625(2).patch, 
> HDFS-14625.003.patch, HDFS-14625.patch
>
>
> As per +HDFS-13270+ (audit logger for Router), we can make the 
> DefaultAuditLogger in FSNamesystem abstract and common.






[jira] [Commented] (HDDS-1767) ContainerStateMachine should have its own executors for executing applyTransaction calls

2019-07-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887783#comment-16887783
 ] 

Hudson commented on HDDS-1767:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16946 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16946/])
HDDS-1767: ContainerStateMachine should have its own executors for (github: rev 
23e9bebe13bd2c79494b1caaa22763914b38b74f)
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java


> ContainerStateMachine should have its own executors for executing 
> applyTransaction calls
> 
>
> Key: HDDS-1767
> URL: https://issues.apache.org/jira/browse/HDDS-1767
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, ContainerStateMachine uses the executors provided by 
> XceiverServerRatis for executing applyTransaction calls. This results in two 
> or more ContainerStateMachines sharing the same set of executors. Delay or 
> load in one ContainerStateMachine would adversely affect the performance of 
> the other state machines in such a case. It is better to have a separate set 
> of executors for each ContainerStateMachine.
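The isolation described above can be sketched as follows: each state machine owns a dedicated executor pool for `applyTransaction`, so load on one pipeline cannot starve another. The class and method names are simplified stand-ins for the Ozone classes, not the actual patch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: per-state-machine executors instead of one shared pool.
public class PerMachineExecutorSketch {
    static class ContainerStateMachine implements AutoCloseable {
        // Dedicated pool: queued work here never blocks other state machines.
        private final ExecutorService applyExecutor =
                Executors.newFixedThreadPool(2);

        CompletableFuture<String> applyTransaction(String tx) {
            return CompletableFuture.supplyAsync(() -> "applied:" + tx, applyExecutor);
        }

        @Override
        public void close() {
            applyExecutor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        try (ContainerStateMachine a = new ContainerStateMachine();
             ContainerStateMachine b = new ContainerStateMachine()) {
            // Each pipeline's transactions run on its own pool.
            System.out.println(a.applyTransaction("tx1").get());
            System.out.println(b.applyTransaction("tx2").get());
        }
    }
}
```

The trade-off is more threads overall; the gain is that a slow disk or a backlog behind one pipeline no longer delays applyTransaction on every other pipeline hosted by the same datanode.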






[jira] [Updated] (HDDS-1767) ContainerStateMachine should have its own executors for executing applyTransaction calls

2019-07-18 Thread Lokesh Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-1767:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> ContainerStateMachine should have its own executors for executing 
> applyTransaction calls
> 
>
> Key: HDDS-1767
> URL: https://issues.apache.org/jira/browse/HDDS-1767
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, ContainerStateMachine uses the executors provided by 
> XceiverServerRatis for executing applyTransaction calls. This results in two 
> or more ContainerStateMachines sharing the same set of executors. Delay or 
> load in one ContainerStateMachine would adversely affect the performance of 
> the other state machines in such a case. It is better to have a separate set 
> of executors for each ContainerStateMachine.






[jira] [Work logged] (HDDS-1767) ContainerStateMachine should have its own executors for executing applyTransaction calls

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1767?focusedWorklogId=278849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278849
 ]

ASF GitHub Bot logged work on HDDS-1767:


Author: ASF GitHub Bot
Created on: 18/Jul/19 09:18
Start Date: 18/Jul/19 09:18
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on issue #1087: HDDS-1767: 
ContainerStateMachine should have its own executors for executing 
applyTransaction calls
URL: https://github.com/apache/hadoop/pull/1087#issuecomment-512736112
 
 
   @mukul1987 Thanks for reviewing the PR! I have merged it with trunk.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 278849)
Time Spent: 40m  (was: 0.5h)

> ContainerStateMachine should have its own executors for executing 
> applyTransaction calls
> 
>
> Key: HDDS-1767
> URL: https://issues.apache.org/jira/browse/HDDS-1767
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, ContainerStateMachine uses the executors provided by 
> XceiverServerRatis for executing applyTransaction calls. This results in two 
> or more ContainerStateMachines sharing the same set of executors. Delay or 
> load in one ContainerStateMachine would adversely affect the performance of 
> the other state machines in such a case. It is better to have a separate set 
> of executors for each ContainerStateMachine.






[jira] [Work logged] (HDDS-1767) ContainerStateMachine should have its own executors for executing applyTransaction calls

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1767?focusedWorklogId=278848=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278848
 ]

ASF GitHub Bot logged work on HDDS-1767:


Author: ASF GitHub Bot
Created on: 18/Jul/19 09:18
Start Date: 18/Jul/19 09:18
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1087: HDDS-1767: 
ContainerStateMachine should have its own executors for executing 
applyTransaction calls
URL: https://github.com/apache/hadoop/pull/1087
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 278848)
Time Spent: 0.5h  (was: 20m)

> ContainerStateMachine should have its own executors for executing 
> applyTransaction calls
> 
>
> Key: HDDS-1767
> URL: https://issues.apache.org/jira/browse/HDDS-1767
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently ContainerStateMachine uses the executors provided by 
> XceiverServerRatis for executing applyTransaction calls. This results in 
> two or more ContainerStateMachines sharing the same set of executors. Delay 
> or load in one ContainerStateMachine would adversely affect the performance 
> of the other state machines in such a case. It is better to have a separate 
> set of executors for each ContainerStateMachine.






[jira] [Work logged] (HDDS-1811) Prometheus metrics are broken for datanodes due to an invalid metric

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1811?focusedWorklogId=278844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278844
 ]

ASF GitHub Bot logged work on HDDS-1811:


Author: ASF GitHub Bot
Created on: 18/Jul/19 09:14
Start Date: 18/Jul/19 09:14
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1118: HDDS-1811. 
Prometheus metrics are broken
URL: https://github.com/apache/hadoop/pull/1118#issuecomment-512734481
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 82 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 27 | Maven dependency ordering for branch |
   | +1 | mvninstall | 533 | trunk passed |
   | +1 | compile | 279 | trunk passed |
   | +1 | checkstyle | 75 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 937 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 169 | trunk passed |
   | 0 | spotbugs | 422 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 636 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 26 | Maven dependency ordering for patch |
   | +1 | mvninstall | 725 | the patch passed |
   | +1 | compile | 366 | the patch passed |
   | +1 | javac | 366 | the patch passed |
   | -0 | checkstyle | 45 | hadoop-hdds: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 897 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 206 | the patch passed |
   | +1 | findbugs | 677 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 423 | hadoop-hdds in the patch failed. |
   | -1 | unit | 2712 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 57 | The patch does not generate ASF License warnings. |
   | | | 9152 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdds.scm.block.TestBlockManager |
   |   | 
hadoop.hdds.scm.container.placement.algorithms.TestContainerPlacementFactory |
   |   | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
   |   | hadoop.hdds.scm.pipeline.TestNodeFailure |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
   |   | hadoop.ozone.container.server.TestSecureContainerServer |
   |   | hadoop.ozone.container.ozoneimpl.TestSecureOzoneContainer |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
   |   | hadoop.hdds.scm.pipeline.TestRatisPipelineCreateAndDestory |
   |   | hadoop.ozone.client.rpc.TestReadRetries |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStream |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=18.09.7 Server=18.09.7 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1118/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1118 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 466c89fad20b 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 
22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3dc256e |
   | Default Java | 1.8.0_212 |
   | checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1118/1/artifact/out/diff-checkstyle-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1118/1/artifact/out/patch-unit-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1118/1/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1118/1/testReport/ |
   | Max. process+thread count | 4151 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/framework hadoop-hdds/container-service U: 
hadoop-hdds |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1118/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 


[jira] [Work logged] (HDDS-1481) Cleanup BasicOzoneFileSystem#mkdir

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1481?focusedWorklogId=278845&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278845
 ]

ASF GitHub Bot logged work on HDDS-1481:


Author: ASF GitHub Bot
Created on: 18/Jul/19 09:14
Start Date: 18/Jul/19 09:14
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on issue #1114: HDDS-1481: 
Cleanup BasicOzoneFileSystem#mkdir
URL: https://github.com/apache/hadoop/pull/1114#issuecomment-512734684
 
 
   @anuengineer Thanks for reviewing the changes! I have merged the PR with 
trunk.
 



Issue Time Tracking
---

Worklog Id: (was: 278845)
Time Spent: 40m  (was: 0.5h)

> Cleanup BasicOzoneFileSystem#mkdir
> --
>
> Key: HDDS-1481
> URL: https://issues.apache.org/jira/browse/HDDS-1481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently BasicOzoneFileSystem#mkdir does not have the optimizations made in 
> HDDS-1300. The changes for this function were missed in HDDS-1460.






[jira] [Updated] (HDDS-1481) Cleanup BasicOzoneFileSystem#mkdir

2019-07-18 Thread Lokesh Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-1481:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Cleanup BasicOzoneFileSystem#mkdir
> --
>
> Key: HDDS-1481
> URL: https://issues.apache.org/jira/browse/HDDS-1481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently BasicOzoneFileSystem#mkdir does not have the optimizations made in 
> HDDS-1300. The changes for this function were missed in HDDS-1460.






[jira] [Commented] (HDDS-1481) Cleanup BasicOzoneFileSystem#mkdir

2019-07-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887776#comment-16887776
 ] 

Hudson commented on HDDS-1481:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16945 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16945/])
HDDS-1481: Cleanup BasicOzoneFileSystem#mkdir (#1114) (github: rev 
53a4c22b403ddf3e0d31c5c3b4a494e58e7b1234)
* (edit) 
hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java


> Cleanup BasicOzoneFileSystem#mkdir
> --
>
> Key: HDDS-1481
> URL: https://issues.apache.org/jira/browse/HDDS-1481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently BasicOzoneFileSystem#mkdir does not have the optimizations made in 
> HDDS-1300. The changes for this function were missed in HDDS-1460.






[jira] [Work logged] (HDDS-1481) Cleanup BasicOzoneFileSystem#mkdir

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1481?focusedWorklogId=278841&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278841
 ]

ASF GitHub Bot logged work on HDDS-1481:


Author: ASF GitHub Bot
Created on: 18/Jul/19 09:10
Start Date: 18/Jul/19 09:10
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1114: HDDS-1481: 
Cleanup BasicOzoneFileSystem#mkdir
URL: https://github.com/apache/hadoop/pull/1114
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 278841)
Time Spent: 0.5h  (was: 20m)

> Cleanup BasicOzoneFileSystem#mkdir
> --
>
> Key: HDDS-1481
> URL: https://issues.apache.org/jira/browse/HDDS-1481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently BasicOzoneFileSystem#mkdir does not have the optimizations made in 
> HDDS-1300. The changes for this function were missed in HDDS-1460.






[jira] [Work logged] (HDDS-1481) Cleanup BasicOzoneFileSystem#mkdir

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1481?focusedWorklogId=278839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278839
 ]

ASF GitHub Bot logged work on HDDS-1481:


Author: ASF GitHub Bot
Created on: 18/Jul/19 09:10
Start Date: 18/Jul/19 09:10
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on issue #1114: HDDS-1481: 
Cleanup BasicOzoneFileSystem#mkdir
URL: https://github.com/apache/hadoop/pull/1114#issuecomment-512733139
 
 
   @anuengineer Thanks for reviewing the PR! I agree too. The acceptance test 
failure does not seem to be related. I will merge it with trunk.
 



Issue Time Tracking
---

Worklog Id: (was: 278839)
Time Spent: 20m  (was: 10m)

> Cleanup BasicOzoneFileSystem#mkdir
> --
>
> Key: HDDS-1481
> URL: https://issues.apache.org/jira/browse/HDDS-1481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently BasicOzoneFileSystem#mkdir does not have the optimizations made in 
> HDDS-1300. The changes for this function were missed in HDDS-1460.






[jira] [Commented] (HDFS-10927) Lease Recovery: File not getting closed on HDFS when block write operation fails

2019-07-18 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887740#comment-16887740
 ] 

He Xiaoqiao commented on HDFS-10927:


Thanks for your report, this is an interesting issue.
In my opinion, it is clear that {{replicaInfo#NumBytes}} is not updated 
correctly when an exception or error occurs.
I am just confused about why it causes the write to fail when only one node 
in the pipeline meets an exception/error. IIUC, when one node meets an 
exception, it ACKs the exception back through the pipeline, and the Client 
picks out this node and then reconstructs the pipeline.
[~ngoswami], [~zhangchen], would you like to offer more information about the 
HBase write flow? I just wonder if there is some case where the DFSClient 
could not complete the write correctly and triggers lease recovery.
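The invariant violated in the stack trace quoted below can be stated compactly: for a replica under recovery, the length acknowledged to the client (visible length) must not exceed the bytes actually persisted on disk. The sketch uses the byte counts from the report; `ReplicaInvariantSketch` is a hypothetical illustration, not HDFS's actual `ReplicaBeingWritten` class.

```java
// Sketch of the invariant checked during replica recovery: bytes ACKed
// to the client (visible length) must already be persisted on disk.
class ReplicaInvariantSketch {
    static boolean isConsistent(long bytesOnDisk, long visibleLength) {
        return bytesOnDisk >= visibleLength;
    }

    public static void main(String[] args) {
        // Byte counts taken from the exception quoted in the report.
        long bytesOnDisk = 45483527L;
        long visibleLength = 45511557L;
        if (isConsistent(bytesOnDisk, visibleLength)) {
            throw new AssertionError("expected inconsistent replica");
        }
        System.out.println("inconsistent replica detected");
    }
}
```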

> Lease Recovery: File not getting closed on HDFS when block write operation 
> fails
> 
>
> Key: HDFS-10927
> URL: https://issues.apache.org/jira/browse/HDFS-10927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.1
>Reporter: Nitin Goswami
>Priority: Major
>
> HDFS was unable to close a file when a block write operation failed because 
> disk usage was too high.
> Scenario:
> HBase was writing WAL logs on HDFS and disk usage was very high at the 
> time. While writing these WAL logs, one of the block write operations failed 
> with the following exception:
> 2016-09-13 10:00:49,978 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Exception for 
> BP-337226066-192.168.193.217-1468912147102:blk_1074859607_1160899
> java.net.SocketTimeoutException: 6 millis timeout while waiting for 
> channel to be ready for read. ch : java.nio.channels.SocketChannel[connected 
> local=/192.168.194.144:50010 remote=/192.168.192.162:43105]
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
> at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
> at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
> at java.io.BufferedInputStream.fill(Unknown Source)
> at java.io.BufferedInputStream.read1(Unknown Source)
> at java.io.BufferedInputStream.read(Unknown Source)
> at java.io.DataInputStream.read(Unknown Source)
> at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:849)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:807)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251)
> at java.lang.Thread.run(Unknown Source)
> After this exception, HBase tried to close/roll over the WAL file, but that 
> call also failed and the WAL file couldn't be closed. HBase then shut down 
> the region server.
> Some time later, lease recovery was triggered for this file and the 
> following exceptions started occurring:
> 2016-09-13 11:51:11,743 WARN 
> org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to 
> obtain replica info for block 
> (=BP-337226066-192.168.193.217-1468912147102:blk_1074859607_1161187) from 
> datanode (=DatanodeInfoWithStorage[192.168.192.162:50010,null,null])
> java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN: getBytesOnDisk() < 
> getVisibleLength(), rip=ReplicaBeingWritten, blk_1074859607_1161187, RBW
>   getNumBytes() = 45524696
>   getBytesOnDisk()  = 45483527
>   getVisibleLength()= 45511557
>   getVolume()   = /opt/reflex/data/yarn/datanode/current
>   getBlockFile()= 
> /opt/reflex/data/yarn/datanode/current/BP-337226066-192.168.193.217-1468912147102/current/rbw/blk_1074859607
>   bytesAcked=45511557
>   bytesOnDisk=45483527
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:2278)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:2254)
> at 
> 

[jira] [Commented] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887738#comment-16887738
 ] 

Hadoop QA commented on HDFS-14034:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 57s{color} | {color:orange} hadoop-hdfs-project: The patch generated 3 new + 
713 unchanged - 8 fixed = 716 total (was 721) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
50s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 38s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
33s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}160m  6s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.8 Server=18.09.8 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14034 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12975122/HDFS-14034.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1543e15e9a2e 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| 

[jira] [Updated] (HDDS-1803) shellcheck.sh does not work on Mac

2019-07-18 Thread Doroszlai, Attila (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doroszlai, Attila updated HDDS-1803:

Target Version/s: 0.5.0

> shellcheck.sh does not work on Mac
> --
>
> Key: HDDS-1803
> URL: https://issues.apache.org/jira/browse/HDDS-1803
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.4.1
>Reporter: Doroszlai, Attila
>Assignee: Doroszlai, Attila
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> # {{shellcheck.sh}} does not work on Mac
> {code}
> find: -executable: unknown primary or operator
> {code}
> # {{$OUTPUT_FILE}} only contains problems from {{hadoop-ozone}}, not from 
> {{hadoop-hdds}}






[jira] [Commented] (HDFS-14660) [SBN Read] ObserverNameNode should throw StandbyException for requests not from ObserverProxyProvider

2019-07-18 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887723#comment-16887723
 ] 

Ayush Saxena commented on HDFS-14660:
-

Thanx Chao!!!

bq. For this case, we should check whether the stateId in the incoming RPC 
header is set or not, and throw a StandbyException when it is not. 

There is also a specific behavior for when the state ID isn't set, i.e. -1: 
the Observer serves the request without regard to the state. I am not sure 
whether somebody has a use case for that, but to handle this scenario that 
behavior would have to be changed.

There is a discussion at HDFS-14636; you may follow up there with some 
proposals or a solution. I guess [~xkrogen] had some concerns there; we can 
conclude once he confirms.

> [SBN Read] ObserverNameNode should throw StandbyException for requests not 
> from ObserverProxyProvider
> -
>
> Key: HDFS-14660
> URL: https://issues.apache.org/jira/browse/HDFS-14660
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>
> In a HDFS HA cluster with consistent reads enabled (HDFS-12943), clients 
> could be using either {{ObserverReadProxyProvider}}, 
> {{ConfiguredProxyProvider}}, or something else. Since observer is just a 
> special type of SBN and we allow transitions between them, a client NOT using 
> {{ObserverReadProxyProvider}} will need to have 
> {{dfs.ha.namenodes.}} include all NameNodes in the cluster, and 
> therefore, it may send request to a observer node.
> For this case, we should check whether the {{stateId}} in the incoming RPC 
> header is set or not, and throw a {{StandbyException}} when it is not. 
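A minimal sketch of the check proposed above, assuming -1 denotes an unset state ID (as discussed in the comments). `ObserverGuardSketch` and `checkRequest` are hypothetical names for illustration; the real guard would live in the NameNode's RPC handling, and HDFS has its own `StandbyException` class.

```java
// Sketch of the proposed guard: an observer rejects calls whose RPC
// header carries no client state ID, pushing such clients to fail over
// to another NameNode instead of reading possibly stale data.
class ObserverGuardSketch {
    static final long UNSET_STATE_ID = -1;

    static class StandbyException extends Exception {
        StandbyException(String msg) { super(msg); }
    }

    static void checkRequest(long clientStateId, boolean isObserver)
            throws StandbyException {
        if (isObserver && clientStateId == UNSET_STATE_ID) {
            throw new StandbyException(
                "Request without a state ID is not served by an observer");
        }
    }

    public static void main(String[] args) {
        boolean rejected = false;
        try {
            checkRequest(UNSET_STATE_ID, true);  // no state ID -> rejected
        } catch (StandbyException e) {
            rejected = true;
        }
        if (!rejected) throw new AssertionError("expected StandbyException");
        System.out.println("rejected=" + rejected);
    }
}
```

A client using a failover proxy provider would treat the exception as a signal to retry against the next NameNode in its list.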






[jira] [Resolved] (HDDS-1821) BlockOutputStream#watchForCommit fails with UnsupportedOperationException when one DN is down

2019-07-18 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee resolved HDDS-1821.
---
Resolution: Duplicate
  Assignee: Shashikant Banerjee

> BlockOutputStream#watchForCommit fails with UnsupportedOperationException 
> when one DN is down
> -
>
> Key: HDDS-1821
> URL: https://issues.apache.org/jira/browse/HDDS-1821
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Nanda kumar
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: blockade
>
> When one of the datanodes in the Ratis pipeline is excluded by introducing 
> a network failure, the client write fails with the following exception
> 2019-07-18 07:13:33 WARN  XceiverClientRatis:262 - 3 way commit failed on 
> pipeline Pipeline[ Id: b338512c-1a3b-4ae6-b89c-7b7737d9bd93, Nodes: 
> ce90cf89-0444-45bf-8c49-a126d8da5a5f{ip: 192.168.240.4, host: 
> ozoneblockade_datanode_2.ozoneblockade_default, networkLocation: 
> /default-rack, certSerialId: null}fa65a457-155d-4bf3-8d1b-b0e11ec157ae{ip: 
> 192.168.240.6, host: ozoneblockade_datanode_3.ozoneblockade_default, 
> networkLocation: /default-rack, certSerialId: 
> null}c5785c99-7dc2-4afc-9054-2efa28a41e7e{ip: 192.168.240.2, host: 
> ozoneblockade_datanode_1.ozoneblockade_default, networkLocation: 
> /default-rack, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
> E java.util.concurrent.ExecutionException: 
> org.apache.ratis.protocol.NotReplicatedException: Request with call Id 2 and 
> log index 9 is not yet replicated to ALL_COMMITTED
> E at 
> java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
> E at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2022)
> E at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:259)
> E at 
> org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:194)
> E at 
> org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnLastIndex(CommitWatcher.java:157)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:348)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:480)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:494)
> E at 
> org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:143)
> E at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:434)
> E at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:472)
> E at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator.createKey(RandomKeyGenerator.java:706)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator.access$1100(RandomKeyGenerator.java:88)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator$ObjectCreator.run(RandomKeyGenerator.java:609)
> E at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> E at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> E at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> E at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> E at java.base/java.lang.Thread.run(Thread.java:834)
> E Caused by: org.apache.ratis.protocol.NotReplicatedException: 
> Request with call Id 2 and log index 9 is not yet replicated to ALL_COMMITTED
> E at 
> org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:245)
> E at 
> org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:254)
> E at 
> org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:249)
> E at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421)
> E at 
> org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
> E at 
> org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)

[jira] [Commented] (HDFS-14621) Distcp can not preserve timestamp with -delete option

2019-07-18 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887716#comment-16887716
 ] 

Ayush Saxena commented on HDFS-14621:
-

Thanx [~pilchard] for the patch. Seems fair enough.
[~ste...@apache.org] can you take a look as well?

> Distcp can not preserve timestamp with -delete  option
> --
>
> Key: HDFS-14621
> URL: https://issues.apache.org/jira/browse/HDFS-14621
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.7.7, 3.1.2
>Reporter: ludun
>Priority: Major
> Attachments: HDFS-14261.001.patch, HDFS-14621.002.patch, 
> HDFS-14621.003.patch
>
>
> Use distcp with the -prbugpcaxt and -delete options to copy data between 
> clusters:
> hadoop distcp -Dmapreduce.job.queuename="QueueA" -prbugpcaxt -update -delete  
> hdfs://sourcecluster/user/hive/warehouse/sum.db 
> hdfs://destcluster/user/hive/warehouse/sum.db
> After distcp, we found that the timestamps at the destination differed from 
> the source, and the timestamps of some directories were the time at which 
> distcp ran.
> Checking the distcp code: in CopyCommitter, timestamps are preserved first, 
> and then the -delete option is processed, which changes the timestamps of 
> destination directories. So the -delete option should be processed first.
>  
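The ordering bug described above can be illustrated with a toy model; the class and method names below are hypothetical simplifications, not the real CopyCommitter code. Deleting extra target files touches the parent directory's mtime, so preserving timestamps before deleting loses the preserved value:

```java
public class CommitOrderDemo {
    static final long SOURCE_MTIME = 1000L; // timestamp recorded on the source
    static final long NOW = 9999L;          // the time the -delete step runs

    // Toy model: deleting an extra child file updates the parent directory's
    // mtime to "now", as a real filesystem would.

    // Current (buggy) order: preserve first, then -delete.
    static long preserveThenDelete() {
        long dirMtime = SOURCE_MTIME; // preserve: copy source timestamp onto dest dir
        dirMtime = NOW;               // -delete: removing a child touches the dir
        return dirMtime;              // preserved timestamp is lost
    }

    // Proposed order: -delete first, then preserve.
    static long deleteThenPreserve() {
        long dirMtime = NOW;          // -delete: removal touches the dir
        dirMtime = SOURCE_MTIME;      // preserve: copy source timestamp last
        return dirMtime;              // matches the source
    }

    public static void main(String[] args) {
        System.out.println("preserve-then-delete mtime: " + preserveThenDelete());
        System.out.println("delete-then-preserve mtime: " + deleteThenPreserve());
    }
}
```

Running the proposed order last leaves the destination directory with the source timestamp, which is the behavior the patch aims for.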



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1823) RatisPipelineProvider#initializePipeline logging needs to be verbose on failures/errors

2019-07-18 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-1823:

Summary: RatisPipelineProvider#initializePipeline logging needs to be 
verbose on failures/errors  (was: RatisPipelineProvider#initializePipeline 
logging needs to be verbose on debugging)

> RatisPipelineProvider#initializePipeline logging needs to be verbose on 
> failures/errors
> ---
>
> Key: HDDS-1823
> URL: https://issues.apache.org/jira/browse/HDDS-1823
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>
> RatisPipelineProvider#initializePipeline does not log the pipeline details 
> or the failed nodes when initializePipeline fails. The logging needs to be 
> more verbose to help with debugging.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-1823) RatisPipelineProvider#initializePipeline logging needs to be verbose on debugging

2019-07-18 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDDS-1823:
---

 Summary: RatisPipelineProvider#initializePipeline logging needs to 
be verbose on debugging
 Key: HDDS-1823
 URL: https://issues.apache.org/jira/browse/HDDS-1823
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM
Affects Versions: 0.4.0
Reporter: Mukul Kumar Singh


RatisPipelineProvider#initializePipeline does not log the pipeline details or 
the failed nodes when initializePipeline fails. The logging needs to be more 
verbose to help with debugging.
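A minimal sketch of the kind of message the issue asks for; the method and parameter names are illustrative, not the actual RatisPipelineProvider API. On failure the log line should carry the pipeline id, the failure ratio, and the exact nodes that failed:

```java
import java.util.List;

public class PipelineInitLogging {
    // Hypothetical helper: builds a verbose failure message instead of a bare
    // error, so operators can see which pipeline and which nodes failed.
    static String failureMessage(String pipelineId, List<String> allNodes,
                                 List<String> failedNodes) {
        return String.format(
            "Failed to initialize pipeline %s: %d/%d nodes failed: %s",
            pipelineId, failedNodes.size(), allNodes.size(), failedNodes);
    }

    public static void main(String[] args) {
        // Example: one of three datanodes failed during initialization.
        System.out.println(failureMessage("pipeline-1",
            List.of("dn1", "dn2", "dn3"), List.of("dn2")));
    }
}
```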



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1821) BlockOutputStream#watchForCommit fails with UnsupportedOperationException when one DN is down

2019-07-18 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-1821:
--
Labels: blockade  (was: )

> BlockOutputStream#watchForCommit fails with UnsupportedOperationException 
> when one DN is down
> -
>
> Key: HDDS-1821
> URL: https://issues.apache.org/jira/browse/HDDS-1821
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Nanda kumar
>Priority: Major
>  Labels: blockade
>
> When one of the datanodes in the Ratis pipeline is isolated by introducing a 
> network failure, the client write fails with the following exception
> {noformat}
> 2019-07-18 07:13:33 WARN  XceiverClientRatis:262 - 3 way commit failed on 
> pipeline Pipeline[ Id: b338512c-1a3b-4ae6-b89c-7b7737d9bd93, Nodes: 
> ce90cf89-0444-45bf-8c49-a126d8da5a5f{ip: 192.168.240.4, host: 
> ozoneblockade_datanode_2.ozoneblockade_default, networkLocation: 
> /default-rack, certSerialId: null}fa65a457-155d-4bf3-8d1b-b0e11ec157ae{ip: 
> 192.168.240.6, host: ozoneblockade_datanode_3.ozoneblockade_default, 
> networkLocation: /default-rack, certSerialId: 
> null}c5785c99-7dc2-4afc-9054-2efa28a41e7e{ip: 192.168.240.2, host: 
> ozoneblockade_datanode_1.ozoneblockade_default, networkLocation: 
> /default-rack, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
> E java.util.concurrent.ExecutionException: 
> org.apache.ratis.protocol.NotReplicatedException: Request with call Id 2 and 
> log index 9 is not yet replicated to ALL_COMMITTED
> E at 
> java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
> E at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2022)
> E at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:259)
> E at 
> org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:194)
> E at 
> org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnLastIndex(CommitWatcher.java:157)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:348)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:480)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:494)
> E at 
> org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:143)
> E at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:434)
> E at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:472)
> E at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator.createKey(RandomKeyGenerator.java:706)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator.access$1100(RandomKeyGenerator.java:88)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator$ObjectCreator.run(RandomKeyGenerator.java:609)
> E at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> E at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> E at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> E at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> E at java.base/java.lang.Thread.run(Thread.java:834)
> E Caused by: org.apache.ratis.protocol.NotReplicatedException: 
> Request with call Id 2 and log index 9 is not yet replicated to ALL_COMMITTED
> E at 
> org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:245)
> E at 
> org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:254)
> E at 
> org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:249)
> E at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421)
> E at 
> org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
> E at 
> org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
> E at 
> 

[jira] [Updated] (HDDS-1821) BlockOutputStream#watchForCommit fails with UnsupportedOperationException when one DN is down

2019-07-18 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-1821:
--
Description: 
When one of the datanodes in the Ratis pipeline is isolated by introducing a 
network failure, the client write fails with the following exception
{noformat}
2019-07-18 07:13:33 WARN  XceiverClientRatis:262 - 3 way commit failed on 
pipeline Pipeline[ Id: b338512c-1a3b-4ae6-b89c-7b7737d9bd93, Nodes: 
ce90cf89-0444-45bf-8c49-a126d8da5a5f{ip: 192.168.240.4, host: 
ozoneblockade_datanode_2.ozoneblockade_default, networkLocation: /default-rack, 
certSerialId: null}fa65a457-155d-4bf3-8d1b-b0e11ec157ae{ip: 192.168.240.6, 
host: ozoneblockade_datanode_3.ozoneblockade_default, networkLocation: 
/default-rack, certSerialId: null}c5785c99-7dc2-4afc-9054-2efa28a41e7e{ip: 
192.168.240.2, host: ozoneblockade_datanode_1.ozoneblockade_default, 
networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:THREE, 
State:OPEN]
E java.util.concurrent.ExecutionException: 
org.apache.ratis.protocol.NotReplicatedException: Request with call Id 2 and 
log index 9 is not yet replicated to ALL_COMMITTED
E   at 
java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
E   at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2022)
E   at 
org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:259)
E   at 
org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:194)
E   at 
org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnLastIndex(CommitWatcher.java:157)
E   at 
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:348)
E   at 
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:480)
E   at 
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:494)
E   at 
org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:143)
E   at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:434)
E   at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:472)
E   at 
org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
E   at 
org.apache.hadoop.ozone.freon.RandomKeyGenerator.createKey(RandomKeyGenerator.java:706)
E   at 
org.apache.hadoop.ozone.freon.RandomKeyGenerator.access$1100(RandomKeyGenerator.java:88)
E   at 
org.apache.hadoop.ozone.freon.RandomKeyGenerator$ObjectCreator.run(RandomKeyGenerator.java:609)
E   at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
E   at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
E   at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
E   at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
E   at java.base/java.lang.Thread.run(Thread.java:834)
E Caused by: org.apache.ratis.protocol.NotReplicatedException: Request 
with call Id 2 and log index 9 is not yet replicated to ALL_COMMITTED
E   at 
org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:245)
E   at 
org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:254)
E   at 
org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:249)
E   at 
org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421)
E   at 
org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
E   at 
org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
E   at 
org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:519)
E   at 
org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
E   at 
org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
E   ... 3 more
E 2019-07-18 07:13:33 INFO  XceiverClientRatis:280 - Could not commit 
index 9 on pipeline Pipeline[ Id: b338512c-1a3b-4ae6-b89c-7b7737d9bd93, Nodes: 
ce90cf89-0444-45bf-8c49-a126d8da5a5f{ip: 192.168.240.4, host: 
ozoneblockade_datanode_2.ozoneblockade_default, networkLocation: 

[jira] [Created] (HDDS-1822) NPE in SCMCommonPolicy.chooseDatanodes

2019-07-18 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDDS-1822:
---

 Summary: NPE in SCMCommonPolicy.chooseDatanodes
 Key: HDDS-1822
 URL: https://issues.apache.org/jira/browse/HDDS-1822
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM
Affects Versions: 0.4.0
Reporter: Mukul Kumar Singh


The following exception is thrown in SCMCommonPolicy.chooseDatanodes:
{code}
java.lang.NullPointerException
at java.util.Objects.requireNonNull(Objects.java:203)
at java.util.ArrayList.removeAll(ArrayList.java:693)
at 
org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMCommonPolicy.chooseDatanodes(SCMCommonPolicy.java:112)
at 
org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRandom.chooseDatanodes(SCMContainerPlacementRandom.java:74)
at 
org.apache.hadoop.hdds.scm.container.placement.algorithms.TestContainerPlacementFactory.testDefaultPolicy(TestContainerPlacementFactory.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{code}

cc : [~xyao] [~Sammi]
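The trace points at `ArrayList.removeAll`, which calls `Objects.requireNonNull` on its argument, so a null excluded-nodes list throws the NPE. A self-contained sketch of the pattern and a null-guard fix; `filterExcluded` is a hypothetical helper mirroring the exclusion step in SCMCommonPolicy#chooseDatanodes, not the real method:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RemoveAllGuard {
    // Hypothetical helper: remove an excluded-nodes list from the healthy-nodes
    // list. ArrayList.removeAll(null) throws NullPointerException from
    // Objects.requireNonNull (the ArrayList.java:693 frame in the trace).
    static List<String> filterExcluded(List<String> healthy, List<String> excluded) {
        List<String> result = new ArrayList<>(healthy);
        // Guard: treat a null excluded list as "exclude nothing" instead of failing.
        result.removeAll(excluded == null ? Collections.emptyList() : excluded);
        return result;
    }

    public static void main(String[] args) {
        List<String> healthy = List.of("dn1", "dn2", "dn3");
        System.out.println(filterExcluded(healthy, null));           // all nodes kept
        System.out.println(filterExcluded(healthy, List.of("dn2"))); // dn2 removed
    }
}
```

Whether the right fix is the guard or ensuring callers never pass null is a design choice for the patch; the sketch only isolates the failing call.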



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1821) BlockOutputStream#watchForCommit fails with UnsupportedOperationException when one DN is down

2019-07-18 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887710#comment-16887710
 ] 

Nanda kumar commented on HDDS-1821:
---

The issue can be reproduced with 
{{blockade/test_blockade_client_failure.py::test_client_failure_isolate_one_datanode}}

> BlockOutputStream#watchForCommit fails with UnsupportedOperationException 
> when one DN is down
> -
>
> Key: HDDS-1821
> URL: https://issues.apache.org/jira/browse/HDDS-1821
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Nanda kumar
>Priority: Major
>
> When one of the datanodes in the Ratis pipeline is isolated by introducing a 
> network failure, the client write fails with the following exception
> {noformat}
> 2019-07-18 07:13:33 WARN  XceiverClientRatis:262 - 3 way commit failed on 
> pipeline Pipeline[ Id: b338512c-1a3b-4ae6-b89c-7b7737d9bd93, Nodes: 
> ce90cf89-0444-45bf-8c49-a126d8da5a5f{ip: 192.168.240.4, host: 
> ozoneblockade_datanode_2.ozoneblockade_default, networkLocation: 
> /default-rack, certSerialId: null}fa65a457-155d-4bf3-8d1b-b0e11ec157ae{ip: 
> 192.168.240.6, host: ozoneblockade_datanode_3.ozoneblockade_default, 
> networkLocation: /default-rack, certSerialId: 
> null}c5785c99-7dc2-4afc-9054-2efa28a41e7e{ip: 192.168.240.2, host: 
> ozoneblockade_datanode_1.ozoneblockade_default, networkLocation: 
> /default-rack, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
> E java.util.concurrent.ExecutionException: 
> org.apache.ratis.protocol.NotReplicatedException: Request with call Id 2 and 
> log index 9 is not yet replicated to ALL_COMMITTED
> E at 
> java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
> E at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2022)
> E at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:259)
> E at 
> org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:194)
> E at 
> org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnLastIndex(CommitWatcher.java:157)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:348)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:480)
> E at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:494)
> E at 
> org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:143)
> E at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:434)
> E at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:472)
> E at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator.createKey(RandomKeyGenerator.java:706)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator.access$1100(RandomKeyGenerator.java:88)
> E at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator$ObjectCreator.run(RandomKeyGenerator.java:609)
> E at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> E at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> E at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> E at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> E at java.base/java.lang.Thread.run(Thread.java:834)
> E Caused by: org.apache.ratis.protocol.NotReplicatedException: 
> Request with call Id 2 and log index 9 is not yet replicated to ALL_COMMITTED
> E at 
> org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:245)
> E at 
> org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:254)
> E at 
> org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:249)
> E at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421)
> E at 
> org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
> E at 
> 

[jira] [Created] (HDDS-1821) BlockOutputStream#watchForCommit fails with UnsupportedOperationException when one DN is down

2019-07-18 Thread Nanda kumar (JIRA)
Nanda kumar created HDDS-1821:
-

 Summary: BlockOutputStream#watchForCommit fails with 
UnsupportedOperationException when one DN is down
 Key: HDDS-1821
 URL: https://issues.apache.org/jira/browse/HDDS-1821
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Reporter: Nanda kumar


When one of the datanodes in the Ratis pipeline is isolated by introducing a 
network failure, the client write fails with the following exception
{noformat}
2019-07-18 07:13:33 WARN  XceiverClientRatis:262 - 3 way commit failed on 
pipeline Pipeline[ Id: b338512c-1a3b-4ae6-b89c-7b7737d9bd93, Nodes: 
ce90cf89-0444-45bf-8c49-a126d8da5a5f{ip: 192.168.240.4, host: 
ozoneblockade_datanode_2.ozoneblockade_default, networkLocation: /default-rack, 
certSerialId: null}fa65a457-155d-4bf3-8d1b-b0e11ec157ae{ip: 192.168.240.6, 
host: ozoneblockade_datanode_3.ozoneblockade_default, networkLocation: 
/default-rack, certSerialId: null}c5785c99-7dc2-4afc-9054-2efa28a41e7e{ip: 
192.168.240.2, host: ozoneblockade_datanode_1.ozoneblockade_default, 
networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:THREE, 
State:OPEN]
E java.util.concurrent.ExecutionException: 
org.apache.ratis.protocol.NotReplicatedException: Request with call Id 2 and 
log index 9 is not yet replicated to ALL_COMMITTED
E   at 
java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
E   at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2022)
E   at 
org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:259)
E   at 
org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:194)
E   at 
org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnLastIndex(CommitWatcher.java:157)
E   at 
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:348)
E   at 
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:480)
E   at 
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:494)
E   at 
org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:143)
E   at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:434)
E   at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:472)
E   at 
org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
E   at 
org.apache.hadoop.ozone.freon.RandomKeyGenerator.createKey(RandomKeyGenerator.java:706)
E   at 
org.apache.hadoop.ozone.freon.RandomKeyGenerator.access$1100(RandomKeyGenerator.java:88)
E   at 
org.apache.hadoop.ozone.freon.RandomKeyGenerator$ObjectCreator.run(RandomKeyGenerator.java:609)
E   at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
E   at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
E   at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
E   at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
E   at java.base/java.lang.Thread.run(Thread.java:834)
E Caused by: org.apache.ratis.protocol.NotReplicatedException: Request 
with call Id 2 and log index 9 is not yet replicated to ALL_COMMITTED
E   at 
org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:245)
E   at 
org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:254)
E   at 
org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:249)
E   at 
org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421)
E   at 
org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
E   at 
org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
E   at 
org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:519)
E   at 
org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
E   at 
org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
E   ... 3 more
E 2019-07-18 07:13:33 INFO  XceiverClientRatis:280 - 

[jira] [Work logged] (HDDS-1713) ReplicationManager fail to find proper node topology based on Datanode details from heartbeat

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1713?focusedWorklogId=278786=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278786
 ]

ASF GitHub Bot logged work on HDDS-1713:


Author: ASF GitHub Bot
Created on: 18/Jul/19 07:21
Start Date: 18/Jul/19 07:21
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1112: HDDS-1713. 
ReplicationManager fail to find proper node topology based…
URL: https://github.com/apache/hadoop/pull/1112#issuecomment-512697445
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 74 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 7 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 24 | Maven dependency ordering for branch |
   | +1 | mvninstall | 472 | trunk passed |
   | +1 | compile | 282 | trunk passed |
   | +1 | checkstyle | 77 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 966 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 178 | trunk passed |
   | 0 | spotbugs | 403 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 627 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 36 | Maven dependency ordering for patch |
   | -1 | mvninstall | 97 | hadoop-hdds in the patch failed. |
   | -1 | mvninstall | 212 | hadoop-ozone in the patch failed. |
   | -1 | compile | 154 | hadoop-ozone in the patch failed. |
   | -1 | cc | 154 | hadoop-ozone in the patch failed. |
   | -1 | javac | 154 | hadoop-ozone in the patch failed. |
   | +1 | checkstyle | 87 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 720 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 156 | the patch passed |
   | -1 | findbugs | 251 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | -1 | unit | 329 | hadoop-hdds in the patch failed. |
   | -1 | unit | 277 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 37 | The patch does not generate ASF License warnings. |
   | | | 5510 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdds.scm.container.placement.algorithms.TestContainerPlacementFactory |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=18.09.8 Server=18.09.8 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1112 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle cc |
   | uname | Linux 0a32b44a126f 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 
08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3dc256e |
   | Default Java | 1.8.0_212 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/patch-mvninstall-hadoop-hdds.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/patch-mvninstall-hadoop-ozone.txt
 |
   | compile | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/patch-compile-hadoop-ozone.txt
 |
   | cc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/patch-compile-hadoop-ozone.txt
 |
   | javac | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/patch-compile-hadoop-ozone.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/patch-findbugs-hadoop-ozone.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/patch-unit-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/testReport/ |
   | Max. process+thread count | 1338 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service 
hadoop-hdds/server-scm hadoop-ozone/integration-test hadoop-ozone/ozone-manager 
U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1112/3/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically 

[jira] [Work logged] (HDDS-1782) Add an option to MiniOzoneChaosCluster to read files multiple times.

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1782?focusedWorklogId=278780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278780
 ]

ASF GitHub Bot logged work on HDDS-1782:


Author: ASF GitHub Bot
Created on: 18/Jul/19 07:03
Start Date: 18/Jul/19 07:03
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on issue #1076: HDDS-1782. Add an 
option to MiniOzoneChaosCluster to read files multiple times. Contributed by 
Mukul Kumar Singh.
URL: https://github.com/apache/hadoop/pull/1076#issuecomment-512692092
 
 
   Thanks @mukul1987 for working on this. The changes look good. I have one 
minor comment:
   Can you add some more description to the TestProbability class to clarify 
what we are trying to achieve with it?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 278780)
Time Spent: 1h 40m  (was: 1.5h)

> Add an option to MiniOzoneChaosCluster to read files multiple times.
> 
>
> Key: HDDS-1782
> URL: https://issues.apache.org/jira/browse/HDDS-1782
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Right now MiniOzoneChaosCluster writes a file, reads it once, and deletes it 
> immediately. This JIRA proposes adding an option to read the file multiple 
> times in MiniOzoneChaosCluster.
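The proposed option can be sketched with plain file I/O; `writeReadDelete` and the `numReads` knob are hypothetical illustrations, not the actual MiniOzoneChaosCluster API. The point is to validate the same data across repeated reads before the delete:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class RepeatedReadDemo {
    // Sketch: instead of write -> read -> delete, read the file numReads times,
    // validating the content on every pass, then delete it. Returns how many
    // reads matched the written data.
    static int writeReadDelete(Path file, byte[] data, int numReads) throws Exception {
        Files.write(file, data);
        int successfulReads = 0;
        for (int i = 0; i < numReads; i++) {
            byte[] read = Files.readAllBytes(file);
            if (Arrays.equals(read, data)) {
                successfulReads++;
            }
        }
        Files.delete(file);
        return successfulReads;
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("chaos-read", ".bin");
        int ok = writeReadDelete(tmp, "payload".getBytes(StandardCharsets.UTF_8), 3);
        System.out.println("successful reads: " + ok);
    }
}
```

Repeating the read widens the window in which injected chaos (node restarts, network faults) can hit the read path, which is what the option is for.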



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1780) TestFailureHandlingByClient tests are flaky

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1780?focusedWorklogId=278776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278776
 ]

ASF GitHub Bot logged work on HDDS-1780:


Author: ASF GitHub Bot
Created on: 18/Jul/19 06:53
Start Date: 18/Jul/19 06:53
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1073: HDDS-1780. 
TestFailureHandlingByClient tests are flaky.
URL: https://github.com/apache/hadoop/pull/1073#issuecomment-512689340
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 69 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 20 | Maven dependency ordering for branch |
   | +1 | mvninstall | 462 | trunk passed |
   | +1 | compile | 249 | trunk passed |
   | +1 | checkstyle | 69 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 894 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 155 | trunk passed |
   | 0 | spotbugs | 306 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 495 | trunk passed |
   | -0 | patch | 348 | Used diff version of patch file. Binary files and 
potentially other changes not applied. Please rebase and squash commits if 
necessary. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 31 | Maven dependency ordering for patch |
   | +1 | mvninstall | 424 | the patch passed |
   | +1 | compile | 256 | the patch passed |
   | +1 | javac | 256 | the patch passed |
   | +1 | checkstyle | 74 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 730 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 169 | the patch passed |
   | +1 | findbugs | 574 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 336 | hadoop-hdds in the patch failed. |
   | -1 | unit | 2014 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
   | | | 7226 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdds.scm.container.placement.algorithms.TestContainerPlacementFactory |
   |   | hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient |
   |   | hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException |
   |   | hadoop.ozone.client.rpc.TestBCSID |
   |   | hadoop.ozone.container.ozoneimpl.TestSecureOzoneContainer |
   |   | hadoop.ozone.client.rpc.TestFailureHandlingByClient |
   |   | hadoop.hdds.scm.pipeline.TestRatisPipelineCreateAndDestory |
   |   | hadoop.ozone.container.server.TestSecureContainerServer |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=18.09.8 Server=18.09.8 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1073/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1073 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 4fe98145f0c4 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 
08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3dc256e |
   | Default Java | 1.8.0_212 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1073/3/artifact/out/patch-unit-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1073/3/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1073/3/testReport/ |
   | Max. process+thread count | 5297 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/client hadoop-ozone/integration-test U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1073/3/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 


[jira] [Work logged] (HDDS-1779) TestWatchForCommit tests are flaky

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1779?focusedWorklogId=278771&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278771
 ]

ASF GitHub Bot logged work on HDDS-1779:


Author: ASF GitHub Bot
Created on: 18/Jul/19 06:50
Start Date: 18/Jul/19 06:50
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1071: HDDS-1779. 
TestWatchForCommit tests are flaky.
URL: https://github.com/apache/hadoop/pull/1071#issuecomment-512688678
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 36 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 485 | trunk passed |
   | +1 | compile | 266 | trunk passed |
   | +1 | checkstyle | 70 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 807 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 163 | trunk passed |
   | 0 | spotbugs | 317 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 510 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 437 | the patch passed |
   | +1 | compile | 258 | the patch passed |
   | +1 | javac | 258 | the patch passed |
   | +1 | checkstyle | 80 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 685 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 163 | the patch passed |
   | +1 | findbugs | 526 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 300 | hadoop-hdds in the patch failed. |
   | -1 | unit | 1690 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 51 | The patch does not generate ASF License warnings. |
   | | | 6705 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdds.scm.container.placement.algorithms.TestContainerPlacementFactory |
   |   | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
   |   | hadoop.ozone.TestSecureOzoneCluster |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException |
   |   | hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient |
   |   | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=18.09.8 Server=18.09.8 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1071/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1071 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux eb6c64493a2f 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3dc256e |
   | Default Java | 1.8.0_212 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1071/3/artifact/out/patch-unit-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1071/3/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1071/3/testReport/ |
   | Max. process+thread count | 5340 (vs. ulimit of 5500) |
   | modules | C: hadoop-ozone/integration-test U: 
hadoop-ozone/integration-test |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1071/3/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 278771)
Time Spent: 1.5h  (was: 1h 20m)

> TestWatchForCommit tests are flaky
> ----------------------------------
>
> Key: HDDS-1779
> URL: https://issues.apache.org/jira/browse/HDDS-1779
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant 

[jira] [Work logged] (HDDS-1811) Prometheus metrics are broken for datanodes due to an invalid metric

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1811?focusedWorklogId=278766&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278766
 ]

ASF GitHub Bot logged work on HDDS-1811:


Author: ASF GitHub Bot
Created on: 18/Jul/19 06:41
Start Date: 18/Jul/19 06:41
Worklog Time Spent: 10m 
  Work Description: adoroszlai commented on issue #1118: HDDS-1811. 
Prometheus metrics are broken
URL: https://github.com/apache/hadoop/pull/1118#issuecomment-512686331
 
 
   /label ozone
 



Issue Time Tracking
---

Worklog Id: (was: 278766)
Time Spent: 20m  (was: 10m)

> Prometheus metrics are broken for datanodes due to an invalid metric
> --------------------------------------------------------------------
>
> Key: HDDS-1811
> URL: https://issues.apache.org/jira/browse/HDDS-1811
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Doroszlai, Attila
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Datanodes can't be monitored with prometheus any more:
> {code}
> level=warn ts=2019-07-16T16:29:55.876Z caller=scrape.go:937 component="scrape 
> manager" scrape_pool=pods target=http://192.168.69.76:9882/prom msg="append 
> failed" err="invalid metric type 
> \"apache.hadoop.ozone.container.common.transport.server.ratis._csm_metrics_delete_container_avg_time
>  gauge\""
> {code}






[jira] [Updated] (HDDS-1811) Prometheus metrics are broken for datanodes due to an invalid metric

2019-07-18 Thread Doroszlai, Attila (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doroszlai, Attila updated HDDS-1811:

Status: Patch Available  (was: In Progress)

> Prometheus metrics are broken for datanodes due to an invalid metric
> --------------------------------------------------------------------
>
> Key: HDDS-1811
> URL: https://issues.apache.org/jira/browse/HDDS-1811
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Doroszlai, Attila
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Datanodes can't be monitored with prometheus any more:
> {code}
> level=warn ts=2019-07-16T16:29:55.876Z caller=scrape.go:937 component="scrape 
> manager" scrape_pool=pods target=http://192.168.69.76:9882/prom msg="append 
> failed" err="invalid metric type 
> \"apache.hadoop.ozone.container.common.transport.server.ratis._csm_metrics_delete_container_avg_time
>  gauge\""
> {code}






[jira] [Work logged] (HDDS-1811) Prometheus metrics are broken for datanodes due to an invalid metric

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1811?focusedWorklogId=278765&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278765
 ]

ASF GitHub Bot logged work on HDDS-1811:


Author: ASF GitHub Bot
Created on: 18/Jul/19 06:40
Start Date: 18/Jul/19 06:40
Worklog Time Spent: 10m 
  Work Description: adoroszlai commented on pull request #1118: HDDS-1811. 
Prometheus metrics are broken
URL: https://github.com/apache/hadoop/pull/1118
 
 
   ## What changes were proposed in this pull request?
   
   Fix invalid metric type errors:
   
   ```
   target=http://192.168.69.76:9882/prom err="invalid metric type 
\"apache.hadoop.ozone.container.common.transport.server.ratis._csm_metrics_delete_container_avg_time
 gauge\""
   ```
   
   and
   
   ```
   target=http://scm:9876/prom err="invalid metric type 
\"_rati_s-_thre_e-d7116831-ac55-4bf2-a259-d85cfba0572d counter\""
   ```
   
1. datanode: avoid `.` in the record name by using the simple class name
2. SCM: replace `-` with `_`.  Also properly convert `ALL_CAPS` names, e.g. 
`RATIS_THREE` to `ratis_three` instead of `_rati_s-_thre_e`.
   
   https://issues.apache.org/jira/browse/HDDS-1811
   
   ## How was this patch tested?
   
   Updated unit test.
   
   Checked metrics in `ozoneperf` pseudo-cluster.
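The two renaming rules described above can be sketched in Java. This is a hypothetical illustration of the conversions, not the actual patch: the class and method names (`PrometheusNameSanitizer`, `normalize`) are invented here, and the real sink code in Hadoop may differ in detail. The key point is inserting `_` only at lowercase-to-uppercase boundaries, so an ALL_CAPS run like `RATIS` stays one word instead of splitting into `_rati_s`:

```java
import java.util.regex.Pattern;

public class PrometheusNameSanitizer {

  // Insert '_' only at a lowercase/digit -> uppercase boundary (camelCase),
  // so runs of capitals such as "RATIS" are kept together as one word.
  private static final Pattern CAMEL_BOUNDARY =
      Pattern.compile("([a-z0-9])([A-Z])");

  public static String normalize(String recordName, String metricName) {
    String name = recordName + "_" + metricName;
    name = CAMEL_BOUNDARY.matcher(name).replaceAll("$1_$2");
    name = name.toLowerCase();
    // Prometheus metric names may not contain '-' or '.'
    return name.replace('-', '_').replace('.', '_');
  }

  public static void main(String[] args) {
    // "RATIS_THREE" is preserved word-by-word instead of "_rati_s-_thre_e"
    System.out.println(normalize("RATIS_THREE", "NumPipelines"));
  }
}
```

With this boundary rule, `normalize("RATIS_THREE", "NumPipelines")` yields `ratis_three_num_pipelines`, matching the `ratis_three` form the patch description aims for.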
 



Issue Time Tracking
---

Worklog Id: (was: 278765)
Time Spent: 10m
Remaining Estimate: 0h

> Prometheus metrics are broken for datanodes due to an invalid metric
> --------------------------------------------------------------------
>
> Key: HDDS-1811
> URL: https://issues.apache.org/jira/browse/HDDS-1811
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Doroszlai, Attila
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Datanodes can't be monitored with prometheus any more:
> {code}
> level=warn ts=2019-07-16T16:29:55.876Z caller=scrape.go:937 component="scrape 
> manager" scrape_pool=pods target=http://192.168.69.76:9882/prom msg="append 
> failed" err="invalid metric type 
> \"apache.hadoop.ozone.container.common.transport.server.ratis._csm_metrics_delete_container_avg_time
>  gauge\""
> {code}






[jira] [Updated] (HDDS-1811) Prometheus metrics are broken for datanodes due to an invalid metric

2019-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1811:
-
Labels: pull-request-available  (was: )

> Prometheus metrics are broken for datanodes due to an invalid metric
> --------------------------------------------------------------------
>
> Key: HDDS-1811
> URL: https://issues.apache.org/jira/browse/HDDS-1811
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Doroszlai, Attila
>Priority: Blocker
>  Labels: pull-request-available
>
> Datanodes can't be monitored with prometheus any more:
> {code}
> level=warn ts=2019-07-16T16:29:55.876Z caller=scrape.go:937 component="scrape 
> manager" scrape_pool=pods target=http://192.168.69.76:9882/prom msg="append 
> failed" err="invalid metric type 
> \"apache.hadoop.ozone.container.common.transport.server.ratis._csm_metrics_delete_container_avg_time
>  gauge\""
> {code}






[jira] [Updated] (HDFS-14257) NPE when given the Invalid path to create target dir

2019-07-18 Thread hemanthboyina (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14257:
-
Attachment: HDFS-14257.002.patch

> NPE when given the Invalid path to create target dir
> ----------------------------------------------------
>
> Key: HDFS-14257
> URL: https://issues.apache.org/jira/browse/HDFS-14257
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Harshakiran Reddy
>Assignee: hemanthboyina
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-14257.001.patch, HDFS-14257.002.patch, 
> HDFS-14257.patch
>
>
> bin> ./hdfs dfs -mkdir hdfs://{color:red}hacluster2 /hacluster1{color}dest2/
> {noformat}
> -mkdir: Fatal internal error
> java.lang.NullPointerException
> at 
> org.apache.hadoop.fs.FileSystem.fixRelativePart(FileSystem.java:2714)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.fixRelativePart(DistributedFileSystem.java:3229)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1618)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1742)
> at 
> org.apache.hadoop.fs.shell.Mkdir.processNonexistentPath(Mkdir.java:74)
> at 
> org.apache.hadoop.fs.shell.Command.processArgument(Command.java:287)
> at 
> org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:121)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> bin>
> {noformat}
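The stack trace above shows the NullPointerException surfacing deep inside `FileSystem.fixRelativePart` after the malformed argument (an authority with a stray space) produces a null path. As a hedged illustration of the kind of up-front validation that avoids this, not the actual HDFS-14257 patch, which may take a different approach, the sketch below rejects the bad URI early with a clear error; the names `PathValidation` and `parseTarget` are invented for this example:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathValidation {

  /**
   * Validates the target argument before it reaches FileSystem code,
   * so a malformed input such as "hdfs://hacluster2 /hacluster1dest2/"
   * fails with a clear message instead of a NullPointerException.
   */
  public static URI parseTarget(String arg) {
    try {
      URI uri = new URI(arg);
      if (uri.getPath() == null || uri.getPath().isEmpty()) {
        throw new IllegalArgumentException("Invalid path in argument: " + arg);
      }
      return uri;
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Cannot parse: " + arg, e);
    }
  }

  public static void main(String[] args) {
    try {
      // The embedded space makes this URI unparseable, so it is rejected here
      parseTarget("hdfs://hacluster2 /hacluster1dest2/");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

A valid argument such as `hdfs://hacluster2/dest2/` passes through unchanged, while both unparseable URIs and authority-only URIs with no path component are rejected before any RPC is attempted.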






[jira] [Commented] (HDFS-14646) Standby NameNode should terminate the FsImage put process immediately if the peer NN is not in the appropriate state to receive an image.

2019-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887659#comment-16887659
 ] 

Hadoop QA commented on HDFS-14646:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 47s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 71 unchanged - 1 fixed = 71 total (was 72) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}109m 19s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}170m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14646 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12975116/HDFS-14646.000.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b94518d832cd 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 
22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 73e6ffc |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27249/artifact/out/whitespace-eol.txt
 |
| unit | 

[jira] [Commented] (HDFS-14257) NPE when given the Invalid path to create target dir

2019-07-18 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887656#comment-16887656
 ] 

Surendra Singh Lilhore commented on HDFS-14257:
---

Thanks [~hemanthboyina] for the patch.

The changes LGTM.

Could you please update the patch? I am not able to apply it on trunk.

> NPE when given the Invalid path to create target dir
> ----------------------------------------------------
>
> Key: HDFS-14257
> URL: https://issues.apache.org/jira/browse/HDFS-14257
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Harshakiran Reddy
>Assignee: hemanthboyina
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-14257.001.patch, HDFS-14257.patch
>
>
> bin> ./hdfs dfs -mkdir hdfs://{color:red}hacluster2 /hacluster1{color}dest2/
> {noformat}
> -mkdir: Fatal internal error
> java.lang.NullPointerException
> at 
> org.apache.hadoop.fs.FileSystem.fixRelativePart(FileSystem.java:2714)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.fixRelativePart(DistributedFileSystem.java:3229)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1618)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1742)
> at 
> org.apache.hadoop.fs.shell.Mkdir.processNonexistentPath(Mkdir.java:74)
> at 
> org.apache.hadoop.fs.shell.Command.processArgument(Command.java:287)
> at 
> org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:121)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> bin>
> {noformat}





