[jira] [Created] (HDDS-2647) Ozone DataNode does not set raft.server.log.corruption.policy to the RaftServer implementation it uses

2019-11-28 Thread Istvan Fajth (Jira)
Istvan Fajth created HDDS-2647:
--

 Summary: Ozone DataNode does not set 
raft.server.log.corruption.policy to the RaftServer implementation it uses
 Key: HDDS-2647
 URL: https://issues.apache.org/jira/browse/HDDS-2647
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Reporter: Istvan Fajth


In the XceiverServerRatis class, which the DataNode uses to create its 
RaftServer implementation, there is a method called newRaftProperties() that 
sets up the RaftProperties object for the RaftServer it starts.

This method is pretty hard to keep in sync with all the Ratis properties. 
While investigating an issue I was pointed to RATIS-677, which introduced a 
new configuration, and I was not able to set this new property via the 
DataNode's ozone-site.xml, as it was not forwarded to the Ratis server.

In the long run we need a better implementation that does not require tuning 
and follow-up for every new Ratis property; however, at the moment, as a quick 
fix, we can just provide the property. If the implementor goes with the easy 
way, then please create a new JIRA for a better solution after finishing this 
one. Also, if I am wrong and Ratis properties can be defined for the DN 
properly elsewhere, please let me know.

As OM also uses Ratis in its HA configuration, this should be checked there 
as well; however, that part is not really important until RATIS-762 is fixed.
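A possible long-term direction, sketched here with entirely hypothetical class and method names (this is not the current Ozone implementation), is to forward every raft.*-prefixed key from the configuration wholesale, so that a newly introduced Ratis property such as raft.server.log.corruption.policy needs no individual handling in newRaftProperties():

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper: pass raft.* keys through to the Ratis server. */
public class RaftPropertyForwarder {

  /**
   * Copies every "raft."-prefixed entry from the Ozone configuration
   * (modeled here as a plain Map) into the Properties object handed to
   * the RaftServer, so a newly introduced Ratis key works without a
   * matching code change.
   */
  public static Properties forwardRaftProperties(Map<String, String> ozoneConf) {
    Properties raftProps = new Properties();
    for (Map.Entry<String, String> e : ozoneConf.entrySet()) {
      if (e.getKey().startsWith("raft.")) {
        raftProps.setProperty(e.getKey(), e.getValue());
      }
    }
    return raftProps;
  }
}
```

With a pass-through like this, a value set for raft.server.log.corruption.policy in ozone-site.xml would reach the RaftServer without touching the DataNode code.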



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] lokeshj1703 commented on a change in pull request #276: HDDS-2637. Handle LeaderNot ready exception in OzoneManager StateMachine and upgrade ratis to latest version.

2019-11-28 Thread GitBox
lokeshj1703 commented on a change in pull request #276: HDDS-2637. Handle 
LeaderNot ready exception in OzoneManager StateMachine and upgrade ratis to 
latest version.
URL: https://github.com/apache/hadoop-ozone/pull/276#discussion_r351872153
 
 

 ##
 File path: 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/OzoneManagerProtocolClientSideTranslatorPB.java
 ##
 @@ -370,13 +396,16 @@ private OMResponse submitRequest(OMRequest omRequest)
 
   return omResponse;
 } catch (ServiceException e) {
-//  throw ProtobufHelper.getRemoteException(e);
   NotLeaderException notLeaderException = getNotLeaderException(e);
   if (notLeaderException == null) {
 throw ProtobufHelper.getRemoteException(e);
 
 Review comment:
   If notLeaderException is null, we will always throw the exception without 
checking the other exception types.
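To illustrate the reviewer's point with a self-contained sketch (the exception classes and the classify method below are simplified stand-ins, not the actual OzoneManager client code): the cause should be unwrapped once and tested against each retriable exception type in turn, falling through to a generic rethrow only when nothing matches:

```java
import java.io.IOException;

/** Stand-in sketch; not the actual Ozone client code. */
public class ExceptionUnwrapSketch {

  // Simplified stand-ins for the Ratis exception types involved.
  static class NotLeaderException extends IOException { }
  static class LeaderNotReadyException extends IOException { }

  /**
   * Classifies the cause of a wrapping exception. Checking each type in
   * turn, instead of giving up after the first check fails, is the
   * behavior the review comment asks for.
   */
  public static String classify(Exception wrapper) {
    Throwable cause = wrapper.getCause();
    if (cause instanceof NotLeaderException) {
      return "failover-to-new-leader";
    }
    if (cause instanceof LeaderNotReadyException) {
      return "retry-on-same-leader";
    }
    return "rethrow";
  }
}
```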


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[GitHub] [hadoop-ozone] elek commented on issue #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available

2019-11-28 Thread GitBox
elek commented on issue #282: HDDS-2646. Start acceptance tests only if at 
least one THREE pipeline is available
URL: https://github.com/apache/hadoop-ozone/pull/282#issuecomment-559548313
 
 
   @ChenSammi You are more experienced with this area. Can you please review 
this approach / patch?






[GitHub] [hadoop-ozone] elek opened a new pull request #282: Hdds 2646

2019-11-28 Thread GitBox
elek opened a new pull request #282: Hdds 2646
URL: https://github.com/apache/hadoop-ozone/pull/282
 
 
   ## What changes were proposed in this pull request?
   
   After [HDDS-2034](https://issues.apache.org/jira/browse/HDDS-2034) (or even 
before?) pipeline creation (or the status transition from ALLOCATED to OPEN) 
requires at least one pipeline report from all of the datanodes. This means 
that the cluster might not be usable even if it is out of safe mode AND there 
are at least three datanodes.
   
   This makes all the acceptance tests unstable.
   
   For example in 
[this](https://github.com/apache/hadoop-ozone/pull/263/checks?check_run_id=324489319)
 run.
   
   ```
   scm_1 | 2019-11-28 11:22:54,401 INFO pipeline.RatisPipelineProvider: 
Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to 
datanode 548f146f-2166-440a-b9f1-83086591ae26
   scm_1 | 2019-11-28 11:22:54,402 INFO pipeline.RatisPipelineProvider: 
Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to 
datanode dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c
   scm_1 | 2019-11-28 11:22:54,404 INFO pipeline.RatisPipelineProvider: 
Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to 
datanode 47dbb8e4-bbde-4164-a798-e47e8c696fb5
   scm_1 | 2019-11-28 11:22:54,405 INFO pipeline.PipelineStateManager: 
Created pipeline Pipeline[ Id: 8dc4aeb6-5ae2-46a0-948d-287c97dd81fb, Nodes: 
548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: 
ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: 
ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}47dbb8e4-bbde-4164-a798-e47e8c696fb5{ip: 172.24.0.2, host: 
ozoneperf_datanode_2.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}, Type:RATIS, Factor:THREE, State:ALLOCATED]
   scm_1 | 2019-11-28 11:22:56,975 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: 
ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}
   scm_1 | 2019-11-28 11:22:58,018 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: 
ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}
   scm_1 | 2019-11-28 11:23:01,871 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: 
ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}
   scm_1 | 2019-11-28 11:23:02,817 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: 
ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}
   scm_1 | 2019-11-28 11:23:02,847 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: 
ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null} 
   ```
   
   As you can see, the pipeline is created but the cluster is not usable, as 
it has not yet been reported back by datanode_2:
   
   ```
   scm_1 | 2019-11-28 11:23:13,879 WARN block.BlockManagerImpl: 
Pipeline creation failed for type:RATIS factor:THREE. Retrying get pipelines 
call once.
   scm_1 | 
org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot 
create pipeline of factor 3 using 0 nodes.
   ```
   
The quick fix is to configure all the compose clusters to wait until (at 
least) one pipeline is available. This can be done by deriving the number of 
required healthy pipelines from the number of datanodes:
   
   ```
   // We only care about THREE replica pipeline
   int minHealthyPipelines = minDatanodes /
   HddsProtos.ReplicationFactor.THREE_VALUE; 
   ```
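   The division above is integer arithmetic: a compose cluster with N datanodes yields N / 3 healthy THREE-factor pipelines to wait for. A minimal self-contained sketch (with THREE_VALUE inlined as 3, since HddsProtos is not available here, and an illustrative class name):

```java
/** Sketch of the proposed readiness check; names are illustrative. */
public class MinPipelineCalc {

  // Stand-in for HddsProtos.ReplicationFactor.THREE_VALUE.
  static final int THREE_VALUE = 3;

  /**
   * Number of healthy THREE-factor pipelines to wait for before the
   * acceptance tests start; integer division, so 3 to 5 datanodes all
   * require a single pipeline.
   */
  public static int minHealthyPipelines(int minDatanodes) {
    return minDatanodes / THREE_VALUE;
  }
}
```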
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-2646
   
   ## How was this patch tested?
   
   If something is wrong, the acceptance tests fail. We need a green run from 
the CI.



[jira] [Updated] (HDDS-2646) Start acceptance tests only if at least one THREE pipeline is available

2019-11-28 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-2646:
--
Priority: Blocker  (was: Major)

> Start acceptance tests only if at least one THREE pipeline is available
> ---
>
> Key: HDDS-2646
> URL: https://issues.apache.org/jira/browse/HDDS-2646
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Marton Elek
>Priority: Blocker
> Attachments: docker-ozoneperf-ozoneperf-basic-scm.log
>
>




[jira] [Updated] (HDDS-2646) Start acceptance tests only if at least one THREE pipeline is available

2019-11-28 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-2646:
--
Attachment: docker-ozoneperf-ozoneperf-basic-scm.log

> Start acceptance tests only if at least one THREE pipeline is available
> ---
>
> Key: HDDS-2646
> URL: https://issues.apache.org/jira/browse/HDDS-2646
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Blocker
> Attachments: docker-ozoneperf-ozoneperf-basic-scm.log
>
>





[jira] [Updated] (HDDS-2646) Start acceptance tests only if at least one THREE pipeline is available

2019-11-28 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-2646:
--
Description: 
After HDDS-2034 (or even before?) pipeline creation (or the status transition 
from ALLOCATE to OPEN) requires at least one pipeline report from all of the 
datanodes. Which means that the cluster might not be usable even if it's out 
from the safe mode AND there are at least three datanodes.

It makes all the acceptance tests unstable.

For example in 
[this|https://github.com/apache/hadoop-ozone/pull/263/checks?check_run_id=324489319]
 run.
{code:java}
scm_1 | 2019-11-28 11:22:54,401 INFO pipeline.RatisPipelineProvider: 
Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to 
datanode 548f146f-2166-440a-b9f1-83086591ae26
scm_1 | 2019-11-28 11:22:54,402 INFO pipeline.RatisPipelineProvider: 
Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to 
datanode dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c
scm_1 | 2019-11-28 11:22:54,404 INFO pipeline.RatisPipelineProvider: 
Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to 
datanode 47dbb8e4-bbde-4164-a798-e47e8c696fb5
scm_1 | 2019-11-28 11:22:54,405 INFO pipeline.PipelineStateManager: 
Created pipeline Pipeline[ Id: 8dc4aeb6-5ae2-46a0-948d-287c97dd81fb, Nodes: 
548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: 
ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: 
ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}47dbb8e4-bbde-4164-a798-e47e8c696fb5{ip: 172.24.0.2, host: 
ozoneperf_datanode_2.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}, Type:RATIS, Factor:THREE, State:ALLOCATED]
scm_1 | 2019-11-28 11:22:56,975 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: 
ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}
scm_1 | 2019-11-28 11:22:58,018 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: 
ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}
scm_1 | 2019-11-28 11:23:01,871 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: 
ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}
scm_1 | 2019-11-28 11:23:02,817 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: 
ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null}
scm_1 | 2019-11-28 11:23:02,847 INFO pipeline.PipelineReportHandler: 
Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 
dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: 
ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, 
certSerialId: null} {code}
As you can see, the pipeline is created but the cluster is not usable, as it 
has not yet been reported back by datanode_2:
{code:java}
scm_1 | 2019-11-28 11:23:13,879 WARN block.BlockManagerImpl: Pipeline 
creation failed for type:RATIS factor:THREE. Retrying get pipelines call once.
scm_1 | 
org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot 
create pipeline of factor 3 using 0 nodes.{code}
 The quick fix is to configure all the compose clusters to wait until one 
pipeline is available. This can be done by adjusting the number of the required 
datanodes:
{code:java}
// We only care about THREE replica pipeline
int minHealthyPipelines = minDatanodes /
HddsProtos.ReplicationFactor.THREE_VALUE; {code}
 

  was:
After HDDS-2034 (or even before?) pipeline creation (or the status transition 
from ALLOCATE to OPEN) requires at least one pipeline report from all of the 
datanodes. Which means that the cluster might not be usable even if it's out 
from the safe mode AND there are at least three datanodes.

It makes all the acceptance tests unstable.

For example in 
[this|https://github.com/apache/hadoop-ozone/pull/263/checks?check_run_id=324489319]
 run.
{code:java}

scm_1 | 2019-11-28 11:22:54,401 INFO pipeline.RatisPipelineProvider: 
Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to 
datanode 548f146f-2166-440a-b9f1-83086591ae26
scm_1 | 2019-11-28 11:22:54,402 INFO pipeline.RatisPipelineProvider: 
Send 

[jira] [Created] (HDDS-2646) Start acceptance tests only if at least one THREE pipeline is available

2019-11-28 Thread Marton Elek (Jira)
Marton Elek created HDDS-2646:
-

 Summary: Start acceptance tests only if at least one THREE 
pipeline is available
 Key: HDDS-2646
 URL: https://issues.apache.org/jira/browse/HDDS-2646
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Marton Elek


After HDDS-2034 (or even before?) pipeline creation (or the status transition 
from ALLOCATE to OPEN) requires at least one pipeline report from all of the 
datanodes. Which means that the cluster might not be usable even if it's out 
from the safe mode AND there are at least three datanodes.

It makes all the acceptance tests unstable.

For example in 
[this|https://github.com/apache/hadoop-ozone/pull/263/checks?check_run_id=324489319]
 run.

 






[jira] [Created] (HDDS-2645) Refactor MiniOzoneChaosCluster to a different package to add filesystem tests

2019-11-28 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created HDDS-2645:
---

 Summary: Refactor MiniOzoneChaosCluster to a different package to 
add filesystem tests
 Key: HDDS-2645
 URL: https://issues.apache.org/jira/browse/HDDS-2645
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: test
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Refactor MiniOzoneChaosCluster to fault-injection-tests. Also add a dependency 
on hadoop-ozone-filesystem to enable adding filesystem tests later.






[jira] [Created] (HDDS-2644) TestTableCacheImpl#testPartialTableCacheWithOverrideAndDelete fails intermittently

2019-11-28 Thread Lokesh Jain (Jira)
Lokesh Jain created HDDS-2644:
-

 Summary: 
TestTableCacheImpl#testPartialTableCacheWithOverrideAndDelete fails 
intermittently
 Key: HDDS-2644
 URL: https://issues.apache.org/jira/browse/HDDS-2644
 Project: Hadoop Distributed Data Store
  Issue Type: Test
Reporter: Lokesh Jain



{code:java}
[ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.87 s 
<<< FAILURE! - in org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl
[ERROR] 
testPartialTableCacheWithOverrideAndDelete[0](org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl)
  Time elapsed: 0.044 s  <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<6>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl.testPartialTableCacheWithOverrideAndDelete(TestTableCacheImpl.java:308)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runners.Suite.runChild(Suite.java:127)
	at org.junit.runners.Suite.runChild(Suite.java:26)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)

{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDDS-2640) Add leaderID information in pipeline list subcommand

2019-11-28 Thread Nilotpal Nandi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nilotpal Nandi reopened HDDS-2640:
--

> Add leaderID information in pipeline list subcommand
> 
>
> Key: HDDS-2640
> URL: https://issues.apache.org/jira/browse/HDDS-2640
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM Client
>Reporter: Nilotpal Nandi
>Assignee: Nilotpal Nandi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Need to add leaderID information in listPipeline subcommand.
> i.e,
> ozone scmcli pipeline list
>  






[GitHub] [hadoop-ozone] nilotpalnandi opened a new pull request #281: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
nilotpalnandi opened a new pull request #281: HDDS-2640 Add leaderID 
information in pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/281
 
 
   ## What changes were proposed in this pull request?
   
   scmcli pipeline list command does not display the leaderID information for 
each pipeline.
   This change will include the leaderID information along with other details.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-2640
   
   ## How was this patch tested?
   
   Applied the patch and rebuilt ozone and then tested it by creating docker 
cluster using docker-compose
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] adoroszlai commented on issue #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
adoroszlai commented on issue #279: HDDS-2640 Add leaderID information in 
pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279#issuecomment-559508963
 
 
   > We need to remove this dependency OR always fix (or rerun) the unit test.
   
   +1 for removing the dependency: it would also let acceptance tests start ~20 
minutes earlier, reducing overall feedback time by 25% (80 -> 60 minutes).





[GitHub] [hadoop-ozone] lokeshj1703 commented on issue #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
lokeshj1703 commented on issue #279: HDDS-2640 Add leaderID information in 
pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279#issuecomment-559508597
 
 
   @elek Yeah, I agree. I think removing the dependency between the acceptance 
tests and the unit tests would be great, because there are a lot of tests which 
fail intermittently.
   The post commit build failed with failure in TestTableCacheImpl.





[GitHub] [hadoop-ozone] elek edited a comment on issue #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
elek edited a comment on issue #279: HDDS-2640 Add leaderID information in 
pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279#issuecomment-559504005
 
 
   No problem (and no need to revert). I just realized how small this patch is 
after I wrote the comment, so I understand that there is very little chance it 
introduces any problem (unless one acceptance test checks the output of the 
pipeline command).
   
   It was more of an FYI:
   
   Master became unstable again, and having reports for all the checks would 
help to understand the root of the problems.
   
   The problem is that we have a strong dependency between the acceptance and 
unit tests: if the unit tests fail (even due to an intermittent error), the 
acceptance tests are not executed.
   
   We need to remove this dependency OR always fix (or rerun) the unit test.
   





[GitHub] [hadoop-ozone] elek commented on issue #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
elek commented on issue #279: HDDS-2640 Add leaderID information in pipeline 
list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279#issuecomment-559504005
 
 
   No problem (and no need to revert). I just realized how small this patch is 
after I wrote the comment, so I understand that it's very likely not 
problematic (unless one acceptance test checks the output).
   
   Master became unstable again, and having reports for all the checks would 
help to understand the root of the problems.
   
   The problem is that we have a strong dependency between the acceptance and 
unit tests: if the unit tests fail (even due to an intermittent error), the 
acceptance tests are not executed.
   
   We need to remove this dependency OR always fix (or rerun) the unit test.
   





[GitHub] [hadoop-ozone] lokeshj1703 merged pull request #280: Revert "HDDS-2640 Add leaderID information in pipeline list subcommand"

2019-11-28 Thread GitBox
lokeshj1703 merged pull request #280: Revert "HDDS-2640 Add leaderID 
information in pipeline list subcommand"
URL: https://github.com/apache/hadoop-ozone/pull/280
 
 
   





[GitHub] [hadoop-ozone] lokeshj1703 opened a new pull request #280: Revert "HDDS-2640 Add leaderID information in pipeline list subcommand"

2019-11-28 Thread GitBox
lokeshj1703 opened a new pull request #280: Revert "HDDS-2640 Add leaderID 
information in pipeline list subcommand"
URL: https://github.com/apache/hadoop-ozone/pull/280
 
 
   Reverts apache/hadoop-ozone#279





[jira] [Created] (HDDS-2642) Expose decommission / maintenance metrics via JMX

2019-11-28 Thread Stephen O'Donnell (Jira)
Stephen O'Donnell created HDDS-2642:
---

 Summary: Expose decommission / maintenance metrics via JMX
 Key: HDDS-2642
 URL: https://issues.apache.org/jira/browse/HDDS-2642
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: SCM
Affects Versions: 0.5.0
Reporter: Stephen O'Donnell


As nodes transition through the decommission and maintenance workflow, we 
should expose the hosts undergoing these admin operations via JMX, along with 
possibly:

1. The stage of the process (close pipelines, replicate containers etc)
2. The number of sufficiently replicated, under replicated and unhealthy 
containers
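A minimal, self-contained sketch of how such metrics could be exposed through a standard JMX MBean. All class, attribute, and domain names here are illustrative assumptions, not the actual Ozone/SCM ones:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class DecommissionMetricsDemo {

    // Standard-MBean contract: the management interface name must be the
    // implementation class name plus the "MBean" suffix.
    public interface AdminProgressMBean {
        int getUnderReplicatedContainers();
        String getStage();
    }

    public static class AdminProgress implements AdminProgressMBean {
        private volatile int underReplicated;
        private volatile String stage = "CLOSE_PIPELINES";

        public int getUnderReplicatedContainers() { return underReplicated; }
        public String getStage() { return stage; }

        void update(String newStage, int count) {
            this.stage = newStage;
            this.underReplicated = count;
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        AdminProgress progress = new AdminProgress();
        // "Hypothetical" domain name; a real SCM would pick its own.
        ObjectName name = new ObjectName("Hypothetical:type=AdminProgress");
        server.registerMBean(progress, name);

        progress.update("REPLICATE_CONTAINERS", 3);

        // A JMX client (e.g. jconsole) would read the same attribute.
        Object value = server.getAttribute(name, "UnderReplicatedContainers");
        if (!Integer.valueOf(3).equals(value)) {
            throw new AssertionError("expected 3, got " + value);
        }
        System.out.println("UnderReplicatedContainers=" + value);
    }
}
```

Once registered this way, the attributes show up automatically in any JMX console, without extra wiring per metric.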






[jira] [Created] (HDDS-2641) Allow SCM webUI to show decommission and maintenance nodes

2019-11-28 Thread Stephen O'Donnell (Jira)
Stephen O'Donnell created HDDS-2641:
---

 Summary: Allow SCM webUI to show decommission and maintenance nodes
 Key: HDDS-2641
 URL: https://issues.apache.org/jira/browse/HDDS-2641
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: SCM
Affects Versions: 0.5.0
Reporter: Stephen O'Donnell


The SCM WebUI should show the current set of decommission and maintenance 
nodes, possibly including the number of containers each node is waiting to have 
replicated.






[GitHub] [hadoop-ozone] elek commented on issue #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
elek commented on issue #279: HDDS-2640 Add leaderID information in pipeline 
list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279#issuecomment-559489646
 
 
   > The test failure does not seem related.
   
   Please don't commit patches with failing unit tests, even if the failures are 
unrelated. The acceptance tests were not executed for this patch because the 
unit tests failed.
   
   If you think a failure is not related, please create a jira, copy the failure 
and the logs, and `@Ignore` the test, but get a green build.





[GitHub] [hadoop-ozone] lokeshj1703 commented on issue #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
lokeshj1703 commented on issue #279: HDDS-2640 Add leaderID information in 
pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279#issuecomment-559487731
 
 
   The test failure does not seem related.





[GitHub] [hadoop-ozone] lokeshj1703 commented on issue #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
lokeshj1703 commented on issue #279: HDDS-2640 Add leaderID information in 
pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279#issuecomment-559488008
 
 
   @nilotpalnandi Thanks for the contribution! I have committed the PR to 
master branch.





[GitHub] [hadoop-ozone] lokeshj1703 merged pull request #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
lokeshj1703 merged pull request #279: HDDS-2640 Add leaderID information in 
pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279
 
 
   





[GitHub] [hadoop-ozone] adoroszlai commented on issue #238: HDDS-2588. Consolidate compose environments

2019-11-28 Thread GitBox
adoroszlai commented on issue #238: HDDS-2588. Consolidate compose environments
URL: https://github.com/apache/hadoop-ozone/pull/238#issuecomment-559477880
 
 
   Thanks for the feedback @elek.
   
   > Can you please update the README.txt
   
   Sure, will do, but didn't want to write doc until the code is OK-ed. ;)





[GitHub] [hadoop-ozone] elek commented on issue #238: HDDS-2588. Consolidate compose environments

2019-11-28 Thread GitBox
elek commented on issue #238: HDDS-2588. Consolidate compose environments
URL: https://github.com/apache/hadoop-ozone/pull/238#issuecomment-559472233
 
 
   > I think (1) and (2) are addressed by the followup commit, which extracts 
monitoring and profiling into separate configs
   
   Thanks for the update, @adoroszlai. This approach is very smart, but I have 
some concern about how easy it is to understand. (One additional function of 
the compose folders is to provide *simple* examples of using Ozone.)
   
   But let's try out this approach. I am fine with it.
   
   Can you please update the README.txt inside `compose/ozone`? (Currently it's 
the original ozoneperf readme; it can be simplified, but we need to add 
information about the `COMPOSE_FILE=...` trick.)
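The `COMPOSE_FILE` trick relies on docker-compose merging every file listed in that variable, so optional overlays (monitoring, profiling) can be layered onto the base cluster definition. A hedged sketch; the file names below are illustrative assumptions, not necessarily the ones in this PR:

```shell
# docker-compose merges the listed files left to right; later files
# override or extend earlier ones. (File names are hypothetical.)
export COMPOSE_FILE=docker-compose.yaml:monitoring.yaml
echo "$COMPOSE_FILE"
# Then the layered cluster would be started with:
#   docker-compose up -d
```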





[jira] [Updated] (HDDS-2640) Add leaderID information in pipeline list subcommand

2019-11-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2640:
-
Labels: pull-request-available  (was: )

> Add leaderID information in pipeline list subcommand
> 
>
> Key: HDDS-2640
> URL: https://issues.apache.org/jira/browse/HDDS-2640
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM Client
>Reporter: Nilotpal Nandi
>Assignee: Nilotpal Nandi
>Priority: Major
>  Labels: pull-request-available
>
> Need to add leaderID information in listPipeline subcommand.
> i.e,
> ozone scmcli pipeline list
>  






[GitHub] [hadoop-ozone] nilotpalnandi opened a new pull request #279: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-28 Thread GitBox
nilotpalnandi opened a new pull request #279: HDDS-2640 Add leaderID 
information in pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/279
 
 
   ## What changes were proposed in this pull request?
   
   (Please fill in changes proposed in this fix)
   
   ## What is the link to the Apache JIRA
   
   (Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HDDS-. Fix a typo in YYY.)
   
   Please replace this section with the link to the Apache JIRA)
   
   ## How was this patch tested?
   
   (Please explain how this patch was tested. Ex: unit tests, manual tests)
   (If this patch involves UI changes, please attach a screen-shot; otherwise, 
remove this)
   





[jira] [Created] (HDDS-2640) Add leaderID information in pipeline list subcommand

2019-11-28 Thread Nilotpal Nandi (Jira)
Nilotpal Nandi created HDDS-2640:


 Summary: Add leaderID information in pipeline list subcommand
 Key: HDDS-2640
 URL: https://issues.apache.org/jira/browse/HDDS-2640
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM Client
Reporter: Nilotpal Nandi
Assignee: Nilotpal Nandi


Need to add leaderID information in listPipeline subcommand.

i.e,

ozone scmcli pipeline list

 






[GitHub] [hadoop-ozone] elek opened a new pull request #278: HDDS-2639. TestTableCacheImpl is flaky

2019-11-28 Thread GitBox
elek opened a new pull request #278: HDDS-2639. TestTableCacheImpl is flaky
URL: https://github.com/apache/hadoop-ozone/pull/278
 
 
   ## What changes were proposed in this pull request?
   
   Run(master): https://github.com/apache/hadoop-ozone/runs/324342299
   
   ```
   
---
   Test set: org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl
   
---
   Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.955 s <<< FAILURE! - in org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl
   testPartialTableCacheWithOverrideAndDelete[0](org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl)  Time elapsed: 0.039 s  <<< FAILURE!
   java.lang.AssertionError: expected:<2> but was:<6>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:743)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at org.junit.Assert.assertEquals(Assert.java:555)
	at org.junit.Assert.assertEquals(Assert.java:542)
	at org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl.testPartialTableCacheWithOverrideAndDelete(TestTableCacheImpl.java:308)
   ```

   ### How to reproduce it locally?
   
   Replace the last `tableCache.cleanup` call of 
`testPartialTableCacheWithOverrideAndDelete` with 
`System.out.println(tableCache.size())`.
   
   You will see that the cache size is `2` even before the cleanup; therefore 
the next `GenericTestUtils.waitFor` is useless (it doesn't guarantee that the 
cleanup is finished).
   
   ### Fix
   
   I propose to call the cleanup synchronously (instead of asynchronously), 
using a `TableCacheImpl` reference instead of the interface. It simplifies the 
test but still validates the behavior.
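The flakiness pattern and the proposed sync-cleanup fix can be illustrated with a minimal, self-contained cache. This is plain Java modeled on the idea, not the actual Ozone `TableCacheImpl`:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SyncCleanupDemo {

    static class Cache {
        private final ConcurrentHashMap<Integer, String> map =
            new ConcurrentHashMap<>();
        private final ExecutorService executor =
            Executors.newSingleThreadExecutor();

        void put(int key, String value) { map.put(key, value); }
        int size() { return map.size(); }

        // Interface-style cleanup: only *schedules* the eviction, so the
        // caller cannot know when (or if) the size has actually changed.
        void cleanupAsync(int upToKey) {
            executor.submit(() -> evict(upToKey));
        }

        // Concrete-class cleanup: eviction is complete when this returns,
        // so an assertion right after it is deterministic.
        void evict(int upToKey) {
            for (int key = 0; key <= upToKey; key++) {
                map.remove(key);
            }
        }

        void shutdown() { executor.shutdown(); }
    }

    public static void main(String[] args) {
        Cache cache = new Cache();
        for (int i = 0; i < 6; i++) {
            cache.put(i, "value-" + i);
        }

        // Asserting right after cleanupAsync(3) could observe any size
        // between 2 and 6 -- exactly the kind of intermittent failure the
        // flaky test shows. Calling evict(3) directly is race-free:
        cache.evict(3);
        if (cache.size() != 2) {
            throw new AssertionError("expected size 2, got " + cache.size());
        }
        System.out.println("size after sync cleanup: " + cache.size());
        cache.shutdown();
    }
}
```

The test-only change is the same idea: hold the concrete type so the cleanup can be invoked synchronously, instead of waiting on an async path whose completion the test cannot observe.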
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-2639
   
   ## How was this patch tested?
   
   The problem can be reproduced locally as described above. 
   
   Fix can be tested with executing the `TestTableCacheImpl` unit test.





[jira] [Created] (HDDS-2639) TestTableCacheImpl is flaky

2019-11-28 Thread Marton Elek (Jira)
Marton Elek created HDDS-2639:
-

 Summary: TestTableCacheImpl is flaky
 Key: HDDS-2639
 URL: https://issues.apache.org/jira/browse/HDDS-2639
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Marton Elek


Run(master): [https://github.com/apache/hadoop-ozone/runs/324342299]

 
{code:java}
---
Test set: org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl
---
Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.955 s <<< FAILURE! - in org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl
testPartialTableCacheWithOverrideAndDelete[0](org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl)  Time elapsed: 0.039 s  <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<6>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:743)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at org.junit.Assert.assertEquals(Assert.java:555)
	at org.junit.Assert.assertEquals(Assert.java:542)
	at org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl.testPartialTableCacheWithOverrideAndDelete(TestTableCacheImpl.java:308)

 {code}
*How to reproduce it locally?*

Replace the last tableCache.evict call of 
testPartialTableCacheWithOverrideAndDelete with 
System.out.println(tableCache.size()).

You will see that the cache size is 2 even before the cleanup; therefore the 
next GenericTestUtils.waitFor is useless (it doesn't guarantee that the cleanup 
is finished).

*Fix:*

I propose to call the cleanup synchronously, using the Impl class instead of 
the interface. It simplifies the test but still validates the behavior.






[GitHub] [hadoop-ozone] elek closed pull request #249: Reopen HDDS-2034 Async RATIS pipeline creation and destroy through heartbeat commands

2019-11-28 Thread GitBox
elek closed pull request #249: Reopen HDDS-2034 Async RATIS pipeline creation 
and destroy through heartbeat commands
URL: https://github.com/apache/hadoop-ozone/pull/249
 
 
   





[jira] [Updated] (HDDS-2628) Make AuditMessage parameters strongly typed

2019-11-28 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai updated HDDS-2628:
---
Status: Patch Available  (was: In Progress)

> Make AuditMessage parameters strongly typed
> ---
>
> Key: HDDS-2628
> URL: https://issues.apache.org/jira/browse/HDDS-2628
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Improve type safety in {{AuditMessage$Builder}} for methods {{forOperation}} 
> and {{withResult}} by using existing {{interface AuditAction}} and {{enum 
> AuditEventStatus}} respectively instead of Strings.






[jira] [Updated] (HDDS-2628) Make AuditMessage parameters strongly typed

2019-11-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2628:
-
Labels: pull-request-available  (was: )

> Make AuditMessage parameters strongly typed
> ---
>
> Key: HDDS-2628
> URL: https://issues.apache.org/jira/browse/HDDS-2628
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Minor
>  Labels: pull-request-available
>
> Improve type safety in {{AuditMessage$Builder}} for methods {{forOperation}} 
> and {{withResult}} by using existing {{interface AuditAction}} and {{enum 
> AuditEventStatus}} respectively instead of Strings.






[GitHub] [hadoop-ozone] adoroszlai opened a new pull request #277: HDDS-2628. Make AuditMessage parameters strongly typed

2019-11-28 Thread GitBox
adoroszlai opened a new pull request #277: HDDS-2628. Make AuditMessage 
parameters strongly typed
URL: https://github.com/apache/hadoop-ozone/pull/277
 
 
   ## What changes were proposed in this pull request?
   
   1. Improve type safety in `AuditMessage$Builder` for methods `forOperation` 
and `withResult` by using existing `interface AuditAction` and `enum 
AuditEventStatus` respectively instead of Strings.
   2. Use existing `Server.getRemoteAddress()` instead of 
`Server.getRemoteIp().getHostAddress()` with null check
   3. Define and use `getRemoteUserName()` along the same lines
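A sketch of the type-safety change described in point 1. The interface and enum below are modeled on, but not copied from, the actual Ozone audit classes; the enum values are illustrative:

```java
public class TypedAuditDemo {

    // Instead of accepting raw Strings, the builder accepts a closed set
    // of operations (an interface) and results (an enum).
    public interface AuditAction {
        String getAction();
    }

    public enum AuditEventStatus { SUCCESS, FAILURE }

    public enum ScmAction implements AuditAction {
        ALLOCATE_BLOCK, SEND_HEARTBEAT;

        public String getAction() { return name(); }
    }

    public static class AuditMessage {
        private final String message;
        private AuditMessage(String message) { this.message = message; }
        public String toString() { return message; }

        public static class Builder {
            private AuditAction op;
            private AuditEventStatus result;

            public Builder forOperation(AuditAction op) {
                this.op = op;
                return this;
            }

            public Builder withResult(AuditEventStatus result) {
                this.result = result;
                return this;
            }

            public AuditMessage build() {
                return new AuditMessage(
                    "op=" + op.getAction() + " | ret=" + result);
            }
        }
    }

    public static void main(String[] args) {
        // A typo like forOperation("ALOCATE_BLOCK") is now a compile-time
        // error rather than a silently wrong audit entry.
        AuditMessage msg = new AuditMessage.Builder()
            .forOperation(ScmAction.ALLOCATE_BLOCK)
            .withResult(AuditEventStatus.SUCCESS)
            .build();
        if (!"op=ALLOCATE_BLOCK | ret=SUCCESS".equals(msg.toString())) {
            throw new AssertionError("unexpected: " + msg);
        }
        System.out.println(msg);
    }
}
```

The payoff is that the compiler now enforces the set of legal operations and results, which is exactly what the String-typed builder could not do.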
   
   https://issues.apache.org/jira/browse/HDDS-2628
   
   ## How was this patch tested?
   
   Created keys using Freon, verified audit log entries.
   
   ```
   2019-11-27 20:48:16,206 | INFO  | SCMAudit | user=hadoop | ip=172.23.0.3 | 
op=ALLOCATE_BLOCK {owner=88982149-2c09-4cd7-8e38-fba8f23cff5e, size=268435456, 
type=RATIS, factor=ONE} | ret=SUCCESS |
   2019-11-27 20:48:45,879 | INFO  | SCMAudit | user=hadoop | ip=172.23.0.4 | 
op=SEND_HEARTBEAT {datanodeUUID=4e2ee488-6dc6-45f8-9f02-02f1a5cff554, 
command=[]} | ret=SUCCESS |
   ```
   
   ```
   2019-11-27 20:48:16,208 | INFO  | OMAudit | user=hadoop | ip=172.23.0.2 | 
op=ALLOCATE_KEY {volume=vol1, bucket=bucket1, key=OkoBsDwxuj/9, dataSize=10240, 
replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID {
 containerBlockID {
   containerID: 1
   localID: 103211840058556425
   ...
   ```
   
   https://github.com/adoroszlai/hadoop-ozone/runs/323724573





[GitHub] [hadoop-ozone] elek commented on a change in pull request #262: HDDS-2459. Change the ReplicationManager to consider decommission and maintenance states

2019-11-28 Thread GitBox
elek commented on a change in pull request #262: HDDS-2459. Change the 
ReplicationManager to consider decommission and maintenance states
URL: https://github.com/apache/hadoop-ozone/pull/262#discussion_r351642030
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
 ##
 @@ -97,6 +98,11 @@
*/
   private final LockManager lockManager;
 
+  /**
+   * Used to lookup the health of a nodes or the nodes operational state.
+   */
+  private final NodeManager nodeManager;
 
 Review comment:
   As far as I understood, the proposal is to update the state of the containers 
in another component based on the node state, and to use only the container 
state here (instead of checking the state via the node manager).
   
   I discussed it with @anuengineer. Let's go forward with this approach and 
later we can improve this part.





[GitHub] [hadoop-ozone] elek commented on issue #262: HDDS-2459. Change the ReplicationManager to consider decommission and maintenance states

2019-11-28 Thread GitBox
elek commented on issue #262: HDDS-2459. Change the ReplicationManager to 
consider decommission and maintenance states
URL: https://github.com/apache/hadoop-ozone/pull/262#issuecomment-559389693
 
 
   > I will change this to @ignore however I have not been able to find the 
cause of the problem
   
   Sure, just link the failing GitHub Actions unit test, download the logs, and 
upload them to the jira (if meaningful; in case of a timeout they can be 
empty). I am not interested in the real root cause, but we need a definition of 
the problem, including assertion errors, exceptions and log output, to check it 
later.





[GitHub] [hadoop-ozone] smengcl commented on a change in pull request #137: HDDS-2455. Implement MiniOzoneHAClusterImpl#getOMLeader

2019-11-28 Thread GitBox
smengcl commented on a change in pull request #137: HDDS-2455. Implement 
MiniOzoneHAClusterImpl#getOMLeader
URL: https://github.com/apache/hadoop-ozone/pull/137#discussion_r351641182
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestMiniOzoneHACluster.java
 ##
 @@ -0,0 +1,108 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone;
+
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.ozone.client.ObjectStore;
+import org.apache.hadoop.ozone.client.OzoneClientFactory;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ExpectedException;
+import org.junit.rules.Timeout;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.TimeoutException;
+
+import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_ACL_ENABLED;
+import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_ADMINISTRATORS_WILDCARD;
+import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_OPEN_KEY_EXPIRE_THRESHOLD_SECONDS;
+
+/**
+ * This class tests MiniOzoneHAClusterImpl.
+ */
+public class TestMiniOzoneHACluster {
+
+  private MiniOzoneHAClusterImpl cluster = null;
+  private ObjectStore objectStore;
+  private OzoneConfiguration conf;
+  private String clusterId;
+  private String scmId;
+  private String omServiceId;
+  private int numOfOMs = 3;
+
+  @Rule
+  public ExpectedException exception = ExpectedException.none();
+
+  @Rule
+  public Timeout timeout = new Timeout(300_000);
+
+  /**
+   * Create a MiniOzoneHAClusterImpl for testing.
+   *
+   * @throws IOException
+   */
+  @Before
+  public void init() throws Exception {
+    conf = new OzoneConfiguration();
+    clusterId = UUID.randomUUID().toString();
+    scmId = UUID.randomUUID().toString();
+    omServiceId = "omServiceId1";
+    conf.setBoolean(OZONE_ACL_ENABLED, true);
+    conf.set(OzoneConfigKeys.OZONE_ADMINISTRATORS,
+        OZONE_ADMINISTRATORS_WILDCARD);
+    conf.setInt(OZONE_OPEN_KEY_EXPIRE_THRESHOLD_SECONDS, 2);
+    cluster = (MiniOzoneHAClusterImpl) MiniOzoneCluster.newHABuilder(conf)
+        .setClusterId(clusterId)
+        .setScmId(scmId)
+        .setOMServiceId(omServiceId)
+        .setNumOfOzoneManagers(numOfOMs)
+        .build();
+    cluster.waitForClusterToBeReady();
+    objectStore = OzoneClientFactory.getRpcClient(omServiceId, conf)
+        .getObjectStore();
+  }
+
+  /**
+   * Shutdown MiniOzoneHAClusterImpl.
+   */
+  @After
+  public void shutdown() {
+    if (cluster != null) {
+      cluster.shutdown();
+    }
+  }
+
+  @Test
+  public void testGetOMLeader() throws InterruptedException, TimeoutException {
+    // Wait for OM leader election to finish
+    GenericTestUtils.waitFor(() -> cluster.getOMLeader() != null,
+        100, 3);
 
 Review comment:
   Thanks @hanishakoneru for the comment.
   
   Note that assigning to `ozoneManager`, which is declared outside the lambda 
expression, requires an atomic holder, since a lambda can only capture 
effectively final local variables.
   
   Just pushed a commit. Please take a look.
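The capture pattern discussed above can be sketched as follows. This is an illustrative standalone example, not the actual `MiniOzoneHAClusterImpl` or `GenericTestUtils.waitFor` API: the class `LeaderWaitSketch` and the methods `pollLeader` and `waitForLeader` are hypothetical names, and `pollLeader` merely simulates a leader election that succeeds on the third poll. The point it demonstrates is that a value produced inside a lambda must escape through an `AtomicReference` (or similar holder), because Java lambdas can only capture effectively final locals.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class LeaderWaitSketch {

  // Hypothetical stand-in for cluster.getOMLeader(): returns null until the
  // simulated election completes on the third poll.
  static String pollLeader(int[] attempts) {
    attempts[0]++;
    return attempts[0] >= 3 ? "om-2" : null;
  }

  // Polls until the supplier yields a non-null leader or attempts run out.
  // The result escapes the polling logic via an AtomicReference holder,
  // since a plain local variable could not be reassigned from a lambda.
  static String waitForLeader(Supplier<String> poll, int maxAttempts)
      throws InterruptedException {
    AtomicReference<String> leader = new AtomicReference<>();
    for (int i = 0; i < maxAttempts && leader.get() == null; i++) {
      leader.set(poll.get());
      if (leader.get() == null) {
        Thread.sleep(10); // poll interval, analogous to waitFor's check period
      }
    }
    return leader.get();
  }

  public static void main(String[] args) throws InterruptedException {
    int[] attempts = {0};
    String leader = waitForLeader(() -> pollLeader(attempts), 10);
    System.out.println(leader); // prints "om-2"
  }
}
```

The single-element array for the attempt counter is the same trick in another guise: the array reference is effectively final, so the lambda may capture it while its contents stay mutable.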




