[jira] [Commented] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-15 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649823#comment-16649823
 ] 

Shashikant Banerjee commented on HDDS-629:
--

Thanks [~jnp], for the review. Patch v6 addresses the review comments as well 
as the one checkstyle issue reported.

Test failures are not related to the patch.

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, 
> HDDS-629.005.patch, HDDS-629.006.patch
>
>
> When a Datanode restarts, it may end up reapplying already applied 
> transactions when it rejoins the pipeline. To handle this, all 
> ApplyTransaction calls in Ratis need to be made idempotent.
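The idempotency requirement above can be sketched as follows. This is an illustrative model only; the class and method names are hypothetical, not the actual ContainerStateMachine API. The idea is that a transaction is skipped when its log index is at or below the highest index already applied to that container, so replayed entries after a restart become no-ops:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IdempotentApplier {
  // Highest applied log index per container. Persisted in the container DB
  // in a real implementation; kept in memory here for illustration.
  private final Map<Long, Long> lastAppliedIndex = new ConcurrentHashMap<>();

  /** Returns true if the transaction was applied, false if it was a replayed duplicate. */
  boolean applyTransaction(long containerId, long logIndex, Runnable op) {
    long applied = lastAppliedIndex.getOrDefault(containerId, -1L);
    if (logIndex <= applied) {
      return false; // already applied before the restart; treat as a no-op
    }
    op.run();
    lastAppliedIndex.put(containerId, logIndex);
    return true;
  }
}
```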



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-15 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-629:
-
Attachment: HDDS-629.006.patch

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, 
> HDDS-629.005.patch, HDDS-629.006.patch
>
>
> When a Datanode restarts, it may end up reapplying already applied 
> transactions when it rejoins the pipeline. To handle this, all 
> ApplyTransaction calls in Ratis need to be made idempotent.






[jira] [Updated] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-14 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-629:
-
Attachment: HDDS-629.005.patch

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, HDDS-629.005.patch
>
>
> When a Datanode restarts, it may end up reapplying already applied 
> transactions when it rejoins the pipeline. To handle this, all 
> ApplyTransaction calls in Ratis need to be made idempotent.






[jira] [Commented] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649719#comment-16649719
 ] 

Shashikant Banerjee commented on HDDS-629:
--

Thanks [~jnp], for the review. Patch v5 addresses the review comments. The test 
failures reported here are not related to the patch.

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, HDDS-629.005.patch
>
>
> When a Datanode restarts, it may end up reapplying already applied 
> transactions when it rejoins the pipeline. To handle this, all 
> ApplyTransaction calls in Ratis need to be made idempotent.






[jira] [Updated] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-15 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-629:
-
Attachment: (was: HDDS-629.006.patch)

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, HDDS-629.005.patch
>
>
> When a Datanode restarts, it may end up reapplying already applied 
> transactions when it rejoins the pipeline. To handle this, all 
> ApplyTransaction calls in Ratis need to be made idempotent.






[jira] [Updated] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-15 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-629:
-
Attachment: HDDS-629.006.patch

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, HDDS-629.005.patch
>
>
> When a Datanode restarts, it may end up reapplying already applied 
> transactions when it rejoins the pipeline. To handle this, all 
> ApplyTransaction calls in Ratis need to be made idempotent.






[jira] [Updated] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-15 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-629:
-
Status: Open  (was: Patch Available)

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, 
> HDDS-629.005.patch, HDDS-629.006.patch
>
>
> When a Datanode restarts, it may end up reapplying already applied 
> transactions when it rejoins the pipeline. To handle this, all 
> ApplyTransaction calls in Ratis need to be made idempotent.






[jira] [Updated] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-15 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-629:
-
Status: Patch Available  (was: Open)

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, 
> HDDS-629.005.patch, HDDS-629.006.patch
>
>
> When a Datanode restarts, it may end up reapplying already applied 
> transactions when it rejoins the pipeline. To handle this, all 
> ApplyTransaction calls in Ratis need to be made idempotent.






[jira] [Commented] (HDDS-667) Fix TestOzoneFileInterfaces

2018-10-16 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651488#comment-16651488
 ] 

Shashikant Banerjee commented on HDDS-667:
--

Thanks [~msingh], for the patch. The patch looks good to me. +1

> Fix TestOzoneFileInterfaces
> ---
>
> Key: HDDS-667
> URL: https://issues.apache.org/jira/browse/HDDS-667
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDDS-667.001.patch
>
>
> The test is failing with the following exception.
> This test is failing after e13a38f4bc358666e64687636cf7b025bce83b46 (HDDS-629)
> {code}
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces
> [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 54.718 s <<< FAILURE! - in org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces
> [ERROR] 
> testOzFsReadWrite[1](org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces)  
> Time elapsed: 7.1 s  <<< ERROR!
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  Unable to find the block.
>   at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:429)
>   at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:103)
>   at 
> org.apache.hadoop.ozone.client.io.ChunkGroupInputStream.getFromOmKeyInfo(ChunkGroupInputStream.java:290)
>   at 
> org.apache.hadoop.ozone.client.rpc.RpcClient.getKey(RpcClient.java:493)
>   at 
> org.apache.hadoop.ozone.client.OzoneBucket.readKey(OzoneBucket.java:272)
>   at 
> org.apache.hadoop.fs.ozone.OzoneFileSystem.open(OzoneFileSystem.java:173)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:899)
>   at 
> org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.testOzFsReadWrite(TestOzoneFileInterfaces.java:175)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runners.Suite.runChild(Suite.java:127)
>   at org.junit.runners.Suite.runChild(Suite.java:26)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
>   at 
> 

[jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-19 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656483#comment-16656483
 ] 

Shashikant Banerjee commented on HDDS-676:
--

Patch v4 addresses the javadoc and checkstyle issues and the failed test cases.

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch
>
>
> With the BlockCommitSequenceId (BCSID) updated per block commit on open 
> containers in both the OM and the datanode, Ozone client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify that the BCSID of the container holding the data block is 
> greater than or equal to the BCSID of the block to be read, and that the 
> existing block's BCSID exactly matches that of the block to be read. As part 
> of this, the client can try to read from a replica with the supplied BCSID 
> and fail over to the next one in case the block does not exist on that replica.
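The replica-selection rule in the description above can be sketched as follows. This is a hedged illustration with hypothetical names, not the actual Ozone client code: a replica qualifies only if its container BCSID has caught up to the requested BCSID and its stored block BCSID matches exactly; otherwise the client fails over to the next replica:

```java
import java.util.List;
import java.util.Optional;

class BcsIdReader {
  // blockBcsId is null when the block is not present on the replica.
  record Replica(long containerBcsId, Long blockBcsId) {}

  /** Try replicas in order; return the first one that can serve the block. */
  static Optional<Replica> findReadableReplica(List<Replica> replicas, long requestedBcsId) {
    for (Replica r : replicas) {
      if (r.containerBcsId() >= requestedBcsId   // container has caught up
          && r.blockBcsId() != null              // block exists on this replica
          && r.blockBcsId() == requestedBcsId) { // block BCSID matches exactly
        return Optional.of(r);
      }
      // otherwise fail over to the next replica
    }
    return Optional.empty(); // "Unable to find the block" on all replicas
  }
}
```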






[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-19 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Attachment: HDDS-676.004.patch

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch
>
>
> With the BlockCommitSequenceId (BCSID) updated per block commit on open 
> containers in both the OM and the datanode, Ozone client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify that the BCSID of the container holding the data block is 
> greater than or equal to the BCSID of the block to be read, and that the 
> existing block's BCSID exactly matches that of the block to be read. As part 
> of this, the client can try to read from a replica with the supplied BCSID 
> and fail over to the next one in case the block does not exist on that replica.






[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-18 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Attachment: HDDS-676.003.patch

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch
>
>
> With the BlockCommitSequenceId (BCSID) updated per block commit on open 
> containers in both the OM and the datanode, Ozone client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify that the BCSID of the container holding the data block is 
> greater than or equal to the BCSID of the block to be read, and that the 
> existing block's BCSID exactly matches that of the block to be read. As part 
> of this, the client can try to read from a replica with the supplied BCSID 
> and fail over to the next one in case the block does not exist on that replica.






[jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-18 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656286#comment-16656286
 ] 

Shashikant Banerjee commented on HDDS-676:
--

Cleaned up the ReadSmallFile command handling related changes in patch v3.

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch
>
>
> With the BlockCommitSequenceId (BCSID) updated per block commit on open 
> containers in both the OM and the datanode, Ozone client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify that the BCSID of the container holding the data block is 
> greater than or equal to the BCSID of the block to be read, and that the 
> existing block's BCSID exactly matches that of the block to be read. As part 
> of this, the client can try to read from a replica with the supplied BCSID 
> and fail over to the next one in case the block does not exist on that replica.






[jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-18 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656270#comment-16656270
 ] 

Shashikant Banerjee commented on HDDS-676:
--

Thanks [~jnp], for the review comments.
{code:java}
What is the reason for changing clientCache key from PipelineId to string? 
{code}
By default, SCM always gives a Ratis pipeline for open containers, and when an 
XceiverClient instance gets created, it is always placed in the client cache 
keyed by the pipeline ID. Since, for a read op, we always want to use a 
Standalone pipeline with the same pipeline ID that SCM provides, the idea is 
to get an XceiverClientGrpc instance with the same pipeline ID, and thus the 
same set of datanodes used by the Ratis pipeline. Changing the key in the 
client cache from the pipeline ID to a string that combines the pipeline ID 
and the type gives us the flexibility to create two different types of 
pipelines with the same pipeline ID.
{code:java}
The changes in XceiverClientRatis are only for testing?
{code}
Yes, for now.
{code:java}
It is minor but I feel we should not make type mutable in the Pipeline class. 
We could clone the Pipeline object to change the type.
{code}
This will be addressed with HDDS-694.
{code:java}
The ContainerStateMachine changes don't look related to this Jira as they are 
about put-small-files. If so, we should put them in a separate jira.
{code}
Opened HDDS-697 for the same.

Rest of the review comments are addressed in the patch along with the 
checkstyle fixes. Javadoc issues seem to be unrelated.
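The cache-key change discussed above can be sketched roughly like this (class and method names are hypothetical, not the actual XceiverClientManager API): keying by pipeline ID plus replication type lets one pipeline ID map to both a Ratis client for writes and a Standalone/gRPC client for reads over the same set of datanodes:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

class XceiverClientCacheSketch {
  enum ReplicationType { RATIS, STAND_ALONE }

  // Cache keyed by pipelineId + type instead of pipelineId alone, so two
  // client types can coexist for the same pipeline ID.
  private final Map<String, Object> cache = new ConcurrentHashMap<>();

  static String key(String pipelineId, ReplicationType type) {
    return pipelineId + type; // e.g. "pipeline-1RATIS" vs "pipeline-1STAND_ALONE"
  }

  Object getOrCreate(String pipelineId, ReplicationType type, Supplier<Object> factory) {
    return cache.computeIfAbsent(key(pipelineId, type), k -> factory.get());
  }
}
```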

 

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch
>
>
> With the BlockCommitSequenceId (BCSID) updated per block commit on open 
> containers in both the OM and the datanode, Ozone client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify that the BCSID of the container holding the data block is 
> greater than or equal to the BCSID of the block to be read, and that the 
> existing block's BCSID exactly matches that of the block to be read. As part 
> of this, the client can try to read from a replica with the supplied BCSID 
> and fail over to the next one in case the block does not exist on that replica.






[jira] [Created] (HDDS-697) update the BCSID for PutSmallFile command

2018-10-18 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-697:


 Summary: update the BCSID for PutSmallFile command
 Key: HDDS-697
 URL: https://issues.apache.org/jira/browse/HDDS-697
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee









[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-18 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Attachment: HDDS-676.002.patch

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch
>
>
> With the BlockCommitSequenceId (BCSID) updated per block commit on open 
> containers in both the OM and the datanode, Ozone client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify that the BCSID of the container holding the data block is 
> greater than or equal to the BCSID of the block to be read, and that the 
> existing block's BCSID exactly matches that of the block to be read. As part 
> of this, the client can try to read from a replica with the supplied BCSID 
> and fail over to the next one in case the block does not exist on that replica.






[jira] [Comment Edited] (HDDS-705) OS3Exception resource name should be the actual resource name

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658749#comment-16658749
 ] 

Shashikant Banerjee edited comment on HDDS-705 at 10/22/18 8:52 AM:


I think the patch committed to trunk includes changes from HDDS-676 as well. 
This commit needs to be reverted, I guess.


was (Author: shashikant):
I think the patch which is committed to trunk has changes from HDDS-676. This 
commit needs to get reverted I guess.

> OS3Exception resource name should be the actual resource name
> -
>
> Key: HDDS-705
> URL: https://issues.apache.org/jira/browse/HDDS-705
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-705.00.patch
>
>
> [https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html]
> {code:xml}
> <Error>
>   <Code>NoSuchKey</Code>
>   <Message>The resource you requested does not exist</Message>
>   <Resource>/mybucket/myfoto.jpg</Resource>
>   <RequestId>4442587FB7D0A2F9</RequestId>
> </Error>
> {code}
>  
> Right now the code prints the resource as "bucket"/"key" instead of the 
> actual resource name.
>  
> The documentation shows the key name together with the bucket, but when 
> tried against the actual AWS S3 endpoint it shows just the key name; this 
> was found using mitmproxy.






[jira] [Commented] (HDDS-705) OS3Exception resource name should be the actual resource name

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658749#comment-16658749
 ] 

Shashikant Banerjee commented on HDDS-705:
--

I think the patch committed to trunk includes changes from HDDS-676. This 
commit needs to be reverted, I guess.

> OS3Exception resource name should be the actual resource name
> -
>
> Key: HDDS-705
> URL: https://issues.apache.org/jira/browse/HDDS-705
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-705.00.patch
>
>
> [https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html]
> {code:xml}
> <Error>
>   <Code>NoSuchKey</Code>
>   <Message>The resource you requested does not exist</Message>
>   <Resource>/mybucket/myfoto.jpg</Resource>
>   <RequestId>4442587FB7D0A2F9</RequestId>
> </Error>
> {code}
>  
> Right now the code prints the resource as "bucket"/"key" instead of the 
> actual resource name.
>  
> The documentation shows the key name together with the bucket, but when 
> tried against the actual AWS S3 endpoint it shows just the key name; this 
> was found using mitmproxy.






[jira] [Commented] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1665#comment-1665
 ] 

Shashikant Banerjee commented on HDDS-697:
--

Patch v1 depends on HDDS-708. Not submitting it for now.

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch
>
>
> Similar to putBlock/GetBlock, the putSmallFile transaction in Ratis needs to 
> update the BCSID in the container db on the datanode. getSmallFile should 
> validate the bcsId while reading the block, similar to getBlock.
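As a rough illustration of the description above (the class and method names are hypothetical, not the actual KeyValueHandler code): putSmallFile records the block's BCSID in the container DB, and getSmallFile validates it the same way getBlock does:

```java
import java.util.HashMap;
import java.util.Map;

class SmallFileBcsIdSketch {
  private final Map<Long, Long> containerDb = new HashMap<>(); // blockId -> BCSID
  private long containerBcsId;

  void putSmallFile(long blockId, byte[] data, long bcsId) {
    // ... write chunk data ...
    containerDb.put(blockId, bcsId);                  // persist the block's BCSID
    containerBcsId = Math.max(containerBcsId, bcsId); // advance container BCSID
  }

  byte[] getSmallFile(long blockId, long expectedBcsId) {
    Long stored = containerDb.get(blockId);
    if (stored == null || stored != expectedBcsId) {
      throw new IllegalStateException("Unable to find the block with BCSID " + expectedBcsId);
    }
    return new byte[0]; // ... read chunk data ...
  }
}
```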






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Description: Similar to putBlock/GetBlock, putSmallFile transaction in 
Ratis needs to update the BCSID in the container db on datanode. getSmallFile 
should validate the bcsId while reading the block similar to getBlock.  (was: 
Similar , to putBlock/GetBlock, putSmallFile transaction in Ratis needs to 
update the BCSID in the container db on datanode. getSmallFile should validate 
the bcsId while reading the block similar to getBlock.)

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch
>
>
> Similar to putBlock/GetBlock, putSmallFile transaction in Ratis needs to 
> update the BCSID in the container db on datanode. getSmallFile should 
> validate the bcsId while reading the block similar to getBlock.






[jira] [Commented] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659126#comment-16659126
 ] 

Shashikant Banerjee commented on HDDS-708:
--

The test failures are not related to the patch.

> Validate BCSID while reading blocks from containers in datanodes
> 
>
> Key: HDDS-708
> URL: https://issues.apache.org/jira/browse/HDDS-708
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-708.000.patch
>
>
> The Ozone client, while making a getBlock call during a read, should read 
> the bcsId for the block from OzoneManager, and the same needs to be 
> validated on the Datanode.






[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Attachment: HDDS-676-ozone-0.3.000.patch

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-676-ozone-0.3.000.patch, HDDS-676.001.patch, 
> HDDS-676.002.patch, HDDS-676.003.patch, HDDS-676.004.patch, 
> HDDS-676.005.patch, HDDS-676.006.patch, HDDS-676.007.patch, HDDS-676.008.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Created] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-22 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-708:


 Summary: Validate BCSID while reading blocks from containers in 
datanodes
 Key: HDDS-708
 URL: https://issues.apache.org/jira/browse/HDDS-708
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee


The Ozone client, while making a getBlock call during a read, should read the 
bcsId for the block from OzoneManager, and the same needs to be validated on 
the Datanode.
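A minimal sketch of the validation rule implied above, assuming the container tracks its highest committed BCSID and each stored block records the BCSID it was committed with; `BcsIdValidator` and `isValidRead` are illustrative names, not the actual BlockManager API:

```java
// Hypothetical sketch of the BCSID check: a read is served only when the
// container has caught up to the requested sequence id and the stored
// block's BCSID matches the one the client obtained from OzoneManager.
class BcsIdValidator {

    static boolean isValidRead(long containerBcsId,
                               long storedBlockBcsId,
                               long requestedBcsId) {
        // Container must have committed at least up to the requested BCSID.
        if (containerBcsId < requestedBcsId) {
            return false;
        }
        // The stored block's BCSID must match exactly.
        return storedBlockBcsId == requestedBcsId;
    }
}
```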






[jira] [Updated] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-708:
-
Attachment: HDDS-708.000.patch

> Validate BCSID while reading blocks from containers in datanodes
> 
>
> Key: HDDS-708
> URL: https://issues.apache.org/jira/browse/HDDS-708
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-708.000.patch
>
>
> The Ozone client, while making a getBlock call during a read, should read 
> the bcsId for the block from OzoneManager, and the same needs to be 
> validated on the Datanode.






[jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658831#comment-16658831
 ] 

Shashikant Banerjee commented on HDDS-676:
--

Thanks [~anu] for the review comments. I have created a new Jira, HDDS-708, 
which will have the changes required for reading and validating the BCSID 
while reading the block from the container DB on datanodes.

Patch v7, attached, has the changes required to enable reads from open 
containers using the standalone gRPC client.

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, 
> HDDS-676.006.patch, HDDS-676.007.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Attachment: HDDS-697.000.patch

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch
>
>
> Similar , to putBlock/GetBlock, putSmallFile transaction in Ratis needs to 
> update the BCSID in the container db on datanode. getSmallFile should 
> validate the bcsId while reading the block similar to getBlock.






[jira] [Created] (HDDS-709) Modify Close Container handling sequence on datanodes

2018-10-22 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-709:


 Summary: Modify Close Container handling sequence on datanodes
 Key: HDDS-709
 URL: https://issues.apache.org/jira/browse/HDDS-709
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Datanode
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee


With the quasi-closed container state for handling majority node failures, the 
close-container handling sequence on Datanodes needs to change. Once the 
datanodes receive a close-container command from SCM, the open container 
replicas will individually be marked as being in the CLOSING state. In the 
CLOSING state, only the transactions coming from the Ratis leader are allowed; 
all other write transactions will fail. A close-container transaction will be 
queued via Ratis on the leader and replicated to the followers, which makes the 
container transition to the CLOSED/QUASI_CLOSED state.
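The sequence above can be sketched as a small state machine; `ContainerReplica` and its methods are hypothetical names for illustration, not the actual datanode code:

```java
// Illustrative sketch of the closing sequence on a single container replica.
class ContainerReplica {
    enum State { OPEN, CLOSING, CLOSED, QUASI_CLOSED }

    private State state = State.OPEN;

    State getState() { return state; }

    // SCM close-container command: an open replica moves to CLOSING.
    void onCloseCommand() {
        if (state == State.OPEN) {
            state = State.CLOSING;
        }
    }

    // In CLOSING, only writes replicated through the Ratis leader are
    // accepted; every other write transaction fails.
    boolean acceptWrite(boolean fromRatisLeader) {
        if (state == State.OPEN) {
            return true;
        }
        if (state == State.CLOSING) {
            return fromRatisLeader;
        }
        return false; // CLOSED / QUASI_CLOSED reject all writes
    }

    // The close transaction itself is queued via Ratis on the leader and
    // replicated to the followers; quorum health decides the terminal state.
    void onCloseTransaction(boolean quorumHealthy) {
        state = quorumHealthy ? State.CLOSED : State.QUASI_CLOSED;
    }
}
```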






[jira] [Updated] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-708:
-
Status: Patch Available  (was: Open)

> Validate BCSID while reading blocks from containers in datanodes
> 
>
> Key: HDDS-708
> URL: https://issues.apache.org/jira/browse/HDDS-708
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-708.000.patch
>
>
> The Ozone client, while making a getBlock call during a read, should read 
> the bcsId for the block from OzoneManager, and the same needs to be 
> validated on the Datanode.






[jira] [Commented] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658832#comment-16658832
 ] 

Shashikant Banerjee commented on HDDS-708:
--

Patch v0 reads the BCSID from OzoneManager while making a getBlock call to the 
Datanode, and validates the BCSID in the BlockManager on the Datanode.

> Validate BCSID while reading blocks from containers in datanodes
> 
>
> Key: HDDS-708
> URL: https://issues.apache.org/jira/browse/HDDS-708
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-708.000.patch
>
>
> The Ozone client, while making a getBlock call during a read, should read 
> the bcsId for the block from OzoneManager, and the same needs to be 
> validated on the Datanode.






[jira] [Updated] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-708:
-
Fix Version/s: 0.3.0

> Validate BCSID while reading blocks from containers in datanodes
> 
>
> Key: HDDS-708
> URL: https://issues.apache.org/jira/browse/HDDS-708
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0
>
>
> The Ozone client, while making a getBlock call during a read, should read 
> the bcsId for the block from OzoneManager, and the same needs to be 
> validated on the Datanode.






[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Attachment: HDDS-676.007.patch

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, 
> HDDS-676.006.patch, HDDS-676.007.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659639#comment-16659639
 ] 

Shashikant Banerjee commented on HDDS-676:
--

Thanks [~anu] for the review comments.
{code:java}
I agree with this premise; that is we only talk to next data node if we get a 
failure on the first data node.

If that is the case, do we need all this Async framework changes, hash tables 
etc?
{code}
If we get a failure or a connection issue with one of the datanodes, we fail 
over to the next datanode. We maintain the state of the active channels for 
communication in the hash map so that, when we close the client, we close all 
the connections. If we don't maintain the state, we need to close the 
connections in the active read path as part of handling the exception, and 
connection errors can be transient. Also, multiple Ozone clients can use the 
same XceiverClient instance, as we maintain a client cache, so immediately 
closing the connection in case one client op fails may not be good.

The HashMap will also be helpful if we have the leader info cached, so that we 
use that specific channel to execute first.

Regarding the async framework change, there is functionally no change in the 
code. It has just been split into two functions so that, while executing the 
command, we execute on a specific channel.
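The failover behaviour discussed above can be sketched as follows; `FailoverReader`, its string-valued channel map, and `readOnChannel` are hypothetical stand-ins for the real XceiverClient and its gRPC channels, not the actual API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class FailoverReader {
    // One entry per datanode: a stand-in for the cached channel, kept so
    // that closing the client can close every connection at once, and so a
    // cached leader could be tried first.
    private final Map<String, String> channels = new LinkedHashMap<>();

    FailoverReader(List<String> datanodes) {
        for (String dn : datanodes) {
            channels.put(dn, "channel-to-" + dn);
        }
    }

    // Try each datanode's channel in order; on a (possibly transient)
    // failure, keep the channel open and fail over to the next replica.
    String read(Function<String, String> readOnChannel) {
        RuntimeException last = null;
        for (String dn : channels.keySet()) {
            try {
                return readOnChannel.apply(dn);
            } catch (RuntimeException e) {
                last = e;
            }
        }
        throw (last != null) ? last : new IllegalStateException("no datanodes");
    }
}
```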

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, 
> HDDS-676.006.patch, HDDS-676.007.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Comment Edited] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659639#comment-16659639
 ] 

Shashikant Banerjee edited comment on HDDS-676 at 10/22/18 8:27 PM:


Thanks [~anu] for the review comments.
{code:java}
I agree with this premise; that is we only talk to next data node if we get a 
failure on the first data node.

If that is the case, do we need all this Async framework changes, hash tables 
etc?
{code}
If we get a failure or a connection issue with one of the datanodes, we fail 
over to the next datanode. We maintain the state of the active channels for 
communication in the hash map so that, when we close the client, we close all 
the connections. If we don't maintain the state, we need to close the 
connections in the active read path as part of handling the exception, and 
connection errors can be transient. Also, multiple Ozone clients can use the 
same XceiverClient instance, as we maintain a client cache, so immediately 
closing the connection in case one client op fails may not be good.

The HashMap will also be helpful if we have the leader info cached, so that we 
use that specific channel to execute first.

Regarding the async framework change, there is functionally no change in the 
code. It has just been split into two functions so that, while executing the 
command, we execute on a specific channel.


was (Author: shashikant):
Thanks [~anu] for the review comments.
{code:java}
I agree with this premise; that is we only talk to next data node if we get a 
failure on the first data node.

If that is the case, do we need all this Async framework changes, hash tables 
etc?
{code}
If we get a failure or a connection issue with one of the datanodes, we fail 
over to the next datanode. We maintain the state of the active channels for 
communication in the hash map so that, when we close the client, we close all 
the connections. If we don't maintain the state, we need to close the 
connections in the active read path as part of handling the exception, and 
connection errors can be transient. Also, multiple Ozone clients can use the 
same XceiverClient instance, as we maintain a client cache, so immediately 
closing the connection in case one client op fails may not be good.

The HashMap will also be helpful if we have the leader info cached, so that we 
use that specific channel to execute first.

Regarding the async framework change, there is functionally no change in the 
code. It has just been split into two functions so that, while executing the 
command, we execute on a specific channel.

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, 
> HDDS-676.006.patch, HDDS-676.007.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Comment Edited] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659639#comment-16659639
 ] 

Shashikant Banerjee edited comment on HDDS-676 at 10/22/18 8:58 PM:


Thanks [~anu] for the review comments.
{code:java}
I agree with this premise; that is we only talk to next data node if we get a 
failure on the first data node.

If that is the case, do we need all this Async framework changes, hash tables 
etc?
{code}
If we get a failure or a connection issue with one of the datanodes, we fail 
over to the next datanode. We maintain the state of the active channels for 
communication in the hash map so that, when we close the client, we close all 
the connections. If we don't maintain the state, we need to close the 
connections in the active read path as part of handling the exception, and 
connection errors can be transient. Also, multiple Ozone clients can use the 
same XceiverClient instance, as we maintain a client cache, so immediately 
closing the connection in case one client op fails may not be good.

The HashMap will also be helpful if we have the leader info cached, so that we 
use that specific channel to execute first.

Regarding the async framework change, there is functionally no change in the 
code. It has just been split into two functions so that, while executing the 
command, we execute on a specific channel.


was (Author: shashikant):
Thanks [~anu] for the review comments.
{code:java}
I agree with this premise; that is we only talk to next data node if we get a 
failure on the first data node.

If that is the case, do we need all this Async framework changes, hash tables 
etc?
{code}
If we get a failure or a connection issue with one of the datanodes, we fail 
over to the next datanode. We maintain the state of the active channels for 
communication in the hash map so that, when we close the client, we close all 
the connections. If we don't maintain the state, we need to close the 
connections in the active read path as part of handling the exception, and 
connection errors can be transient. Also, multiple Ozone clients can use the 
same XceiverClient instance, as we maintain a client cache, so immediately 
closing the connection in case one client op fails may not be good.

The HashMap will also be helpful if we have the leader info cached, so that we 
use that specific channel to execute first.

Regarding the async framework change, there is functionally no change in the 
code. It has just been split into two functions so that, while executing the 
command, we execute on a specific channel.

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, 
> HDDS-676.006.patch, HDDS-676.007.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-22 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659673#comment-16659673
 ] 

Shashikant Banerjee commented on HDDS-676:
--

Patch v8 addresses the review comments in *testPutKeyAndGetKey.*

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, 
> HDDS-676.006.patch, HDDS-676.007.patch, HDDS-676.008.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Attachment: HDDS-676.008.patch

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, 
> HDDS-676.006.patch, HDDS-676.007.patch, HDDS-676.008.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Updated] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-708:
-
Attachment: HDDS-708.001.patch

> Validate BCSID while reading blocks from containers in datanodes
> 
>
> Key: HDDS-708
> URL: https://issues.apache.org/jira/browse/HDDS-708
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-708.000.patch, HDDS-708.001.patch
>
>
> The Ozone client, while making a getBlock call during a read, should read 
> the bcsId for the block from OzoneManager, and the same needs to be 
> validated on the Datanode.






[jira] [Commented] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-23 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660157#comment-16660157
 ] 

Shashikant Banerjee commented on HDDS-708:
--

Thanks [~msingh], for the review comments. Patch v1 addresses your review 
comments.

> Validate BCSID while reading blocks from containers in datanodes
> 
>
> Key: HDDS-708
> URL: https://issues.apache.org/jira/browse/HDDS-708
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-708.000.patch, HDDS-708.001.patch
>
>
> The Ozone client, while making a getBlock call during a read, should read 
> the bcsId for the block from OzoneManager, and the same needs to be 
> validated on the Datanode.






[jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-19 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656786#comment-16656786
 ] 

Shashikant Banerjee commented on HDDS-676:
--

The test failures and the findbugs warning are not related to the patch.

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-20 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Attachment: HDDS-676.006.patch

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, HDDS-676.006.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well as on the datanode, Ozone Client reads can go 
> through the Standalone protocol, not necessarily requiring Ratis. The client 
> should verify the BCSID of the container holding the data block, which should 
> always be greater than or equal to the BCSID of the block to be read, and the 
> existing block's BCSID should exactly match that of the block to be read. As 
> part of this, the client can try to read from a replica with a supplied BCSID 
> and fail over to the next one in case the block does not exist on one replica.






[jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-20 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657885#comment-16657885
 ] 

Shashikant Banerjee commented on HDDS-676:
--

patch v6 addresses the checkstyle issues.

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch, HDDS-676.006.patch
>
>
> With the BlockCommitSequenceId getting updated per block commit on open
> containers in both OM and the datanode, Ozone client reads can go through the
> Standalone protocol, not necessarily requiring Ratis. The client should verify
> the BCSID of the container that has the data block, which should always be
> greater than or equal to the BCSID of the block to be read, and the existing
> block BCSID should exactly match that of the block to be read. As part of
> this, the client can try to read from a replica with a supplied BCSID and
> fail over to the next one in case the block does not exist on that replica.






[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-17 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Attachment: HDDS-676.001.patch

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch
>
>
> With the BlockCommitSequenceId getting updated per block commit on open
> containers in both OM and the datanode, Ozone client reads can go through the
> Standalone protocol, not necessarily requiring Ratis. The client should verify
> the BCSID of the container that has the data block, which should always be
> greater than or equal to the BCSID of the block to be read, and the existing
> block BCSID should exactly match that of the block to be read. As part of
> this, the client can try to read from a replica with a supplied BCSID and
> fail over to the next one in case the block does not exist on that replica.






[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-17 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Status: Patch Available  (was: Open)

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-676.001.patch
>
>
> With the BlockCommitSequenceId getting updated per block commit on open
> containers in both OM and the datanode, Ozone client reads can go through the
> Standalone protocol, not necessarily requiring Ratis. The client should verify
> the BCSID of the container that has the data block, which should always be
> greater than or equal to the BCSID of the block to be read, and the existing
> block BCSID should exactly match that of the block to be read. As part of
> this, the client can try to read from a replica with a supplied BCSID and
> fail over to the next one in case the block does not exist on that replica.






[jira] [Created] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-17 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-676:


 Summary: Enable Read from open Containers via Standalone Protocol
 Key: HDDS-676
 URL: https://issues.apache.org/jira/browse/HDDS-676
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee


With the BlockCommitSequenceId getting updated per block commit on open containers
in both OM and the datanode, Ozone client reads can go through the Standalone
protocol, not necessarily requiring Ratis. The client should verify the BCSID of
the container that has the data block, which should always be greater than or
equal to the BCSID of the block to be read, and the existing block BCSID should
exactly match that of the block to be read. As part of this, the client can try
to read from a replica with a supplied BCSID and fail over to the next one in
case the block does not exist on that replica.
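The read path described here can be sketched as follows. This is an illustrative sketch, not the actual Ozone client code: the `Replica` type, the in-memory maps, and `readBlock` are hypothetical stand-ins for the real datanode RPCs.

```java
import java.util.List;
import java.util.Map;

// Sketch of BCSID-validated reads with replica failover (illustrative only).
class BcsidRead {
  // Hypothetical stand-in for a container replica on a datanode.
  static class Replica {
    final long containerBcsid;          // highest BCSID committed on this replica
    final Map<Long, Long> blockBcsids;  // blockId -> BCSID stored for the block
    Replica(long containerBcsid, Map<Long, Long> blockBcsids) {
      this.containerBcsid = containerBcsid;
      this.blockBcsids = blockBcsids;
    }
  }

  // Returns the first replica that can serve the block, or null if none can.
  static Replica readBlock(List<Replica> replicas, long blockId, long bcsid) {
    for (Replica r : replicas) {
      // The container's BCSID must have advanced at least to the block's BCSID...
      if (r.containerBcsid < bcsid) {
        continue;                       // fail over to the next replica
      }
      // ...and the stored block BCSID must match the supplied one exactly.
      Long stored = r.blockBcsids.get(blockId);
      if (stored != null && stored == bcsid) {
        return r;
      }
      // Block missing or BCSID mismatch: fail over to the next replica.
    }
    return null;                        // no replica could serve the block
  }
}
```

A replica whose container BCSID lags the requested block BCSID simply has not caught up yet, which is why it is skipped rather than treated as an error.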






[jira] [Created] (HDDS-675) Add blocking buffer and use watchApi for flush/close in OzoneClient

2018-10-17 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-675:


 Summary: Add blocking buffer and use watchApi for flush/close in 
OzoneClient
 Key: HDDS-675
 URL: https://issues.apache.org/jira/browse/HDDS-675
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee


To handle 2-node failures, a blocking buffer will be used that waits for the
flush commit index to be updated on all replicas of a container via Ratis.
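The idea can be sketched as below; `CommitWatcher` is a hypothetical stand-in for the Ratis watch API, and the index bookkeeping is simplified for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Sketch of a blocking buffer: flush() blocks until the commit index of the
// last buffered write has been applied on all replicas (illustrative only).
class BlockingBuffer {
  // Hypothetical stand-in for the Ratis watch API.
  interface CommitWatcher {
    // Completes when every replica has applied the given log index.
    CompletableFuture<Void> watchForCommit(long index);
  }

  private long lastWrittenIndex;

  // Buffer a chunk; pretend each write is assigned the next Ratis log index.
  long write(byte[] chunk) {
    return ++lastWrittenIndex;
  }

  // Block until every replica has committed up to the last written index, so
  // even a 2-node failure after flush() cannot lose acknowledged data.
  void flush(CommitWatcher watcher) {
    watcher.watchForCommit(lastWrittenIndex)
        .orTimeout(30, TimeUnit.SECONDS)
        .join();  // throws an unchecked CompletionException on timeout/failure
  }
}
```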






[jira] [Commented] (HDDS-629) Make ApplyTransaction calls in ContainerStateMachine idempotent

2018-10-15 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650533#comment-16650533
 ] 

Shashikant Banerjee commented on HDDS-629:
--

The test failures and the findbug warning are not related to the patch.

> Make ApplyTransaction calls in ContainerStateMachine idempotent
> ---
>
> Key: HDDS-629
> URL: https://issues.apache.org/jira/browse/HDDS-629
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-629.000.patch, HDDS-629.001.patch, 
> HDDS-629.002.patch, HDDS-629.003.patch, HDDS-629.004.patch, 
> HDDS-629.005.patch, HDDS-629.006.patch
>
>
> When a Datanode restarts, it can reapply already-applied transactions when it
> joins the pipeline again. To handle this, all ApplyTransaction calls in Ratis
> need to be made idempotent.
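One common way to achieve this is sketched below, under the assumption that each container tracks the highest log index it has applied; the actual ContainerStateMachine implementation differs in detail.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: skip log entries at or below the index already applied to a
// container, so replaying the Ratis log after a restart is a no-op for
// transactions applied before the crash (illustrative only).
class IdempotentApply {
  private final Map<Long, Long> appliedIndex = new ConcurrentHashMap<>();

  // Returns true if the transaction mutated state, false if it was a replay.
  boolean applyTransaction(long containerId, long logIndex, Runnable mutation) {
    long applied = appliedIndex.getOrDefault(containerId, -1L);
    if (logIndex <= applied) {
      return false;  // already applied before the restart; skip
    }
    mutation.run();
    appliedIndex.put(containerId, logIndex);
    return true;
  }
}
```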






[jira] [Updated] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-708:
-
   Resolution: Fixed
Fix Version/s: 0.4.0
   Status: Resolved  (was: Patch Available)

Thanks [~anu], [~msingh] for the review. I have committed this to trunk as well
as the ozone-0.3 branch.

> Validate BCSID while reading blocks from containers in datanodes
> 
>
> Key: HDDS-708
> URL: https://issues.apache.org/jira/browse/HDDS-708
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0, 0.4.0
>
> Attachments: HDDS-708.000.patch, HDDS-708.001.patch
>
>
> While making a getBlock call to read data, the Ozone client should read the
> bcsId for the block from OzoneManager, and the same needs to be validated on
> the Datanode.
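The datanode-side check can be sketched as follows; the method and the exception type are illustrative stand-ins, not the actual handler code.

```java
// Sketch of the datanode-side validation: a getBlock request carries the
// bcsId the client read from OM, and the datanode rejects the request if
// the container db holds a different BCSID for that block (illustrative).
class BcsidValidator {
  static void validate(long requestedBcsid, long containerBcsid,
                       long storedBlockBcsid) {
    // The container must have committed at least up to the requested BCSID.
    if (containerBcsid < requestedBcsid) {
      throw new IllegalStateException("Container BCSID " + containerBcsid
          + " is behind requested block BCSID " + requestedBcsid);
    }
    // The stored block BCSID must match the requested one exactly.
    if (storedBlockBcsid != requestedBcsid) {
      throw new IllegalStateException("Block BCSID mismatch: requested "
          + requestedBcsid + " but stored " + storedBlockBcsid);
    }
  }
}
```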






[jira] [Updated] (HDDS-717) Add a test to write data on datanodes with higher bcsid and commit the key to OM with lower bcsid and then read

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-717:
-
Summary: Add a test to write data on datanodes with higher bcsid and commit 
the key to OM with lower bcsid and then read  (was: Add a test to write data on 
datanodes with higher BCSID and commit the key to OM with lower bcsid and then 
read)

> Add a test to write data on datanodes with higher bcsid and commit the key to 
> OM with lower bcsid and then read
> ---
>
> Key: HDDS-717
> URL: https://issues.apache.org/jira/browse/HDDS-717
> Project: Hadoop Distributed Data Store
>  Issue Type: Test
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>







[jira] [Updated] (HDDS-717) Add a test to write data on datanodes with higher BCSID and commit the key to OM with lower bcsid and then read

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-717:
-
Summary: Add a test to write data on datanodes with higher BCSID and commit 
the key to OM with lower bcsid and then read  (was: Add a test to write data on 
datanodes with higher BCSID and commit the key to OM with lower bcsid)

> Add a test to write data on datanodes with higher BCSID and commit the key to 
> OM with lower bcsid and then read
> ---
>
> Key: HDDS-717
> URL: https://issues.apache.org/jira/browse/HDDS-717
> Project: Hadoop Distributed Data Store
>  Issue Type: Test
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>







[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
Status: Patch Available  (was: Reopened)

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-676-ozone-0.3.000.patch, HDDS-676.001.patch, 
> HDDS-676.002.patch, HDDS-676.003.patch, HDDS-676.004.patch, 
> HDDS-676.005.patch, HDDS-676.006.patch, HDDS-676.007.patch, HDDS-676.008.patch
>
>
> With the BlockCommitSequenceId getting updated per block commit on open
> containers in both OM and the datanode, Ozone client reads can go through the
> Standalone protocol, not necessarily requiring Ratis. The client should verify
> the BCSID of the container that has the data block, which should always be
> greater than or equal to the BCSID of the block to be read, and the existing
> block BCSID should exactly match that of the block to be read. As part of
> this, the client can try to read from a replica with a supplied BCSID and
> fail over to the next one in case the block does not exist on that replica.






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Status: Patch Available  (was: Open)

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to
> update the BCSID in the container db on the datanode. getSmallFile should
> validate the bcsId while reading the block, similar to getBlock.






[jira] [Created] (HDDS-717) Add a test to write data on datanodes with higher BCSID and commit the key to OM with lower bcsid

2018-10-23 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-717:


 Summary: Add a test to write data on datanodes with higher BCSID 
and commit the key to OM with lower bcsid
 Key: HDDS-717
 URL: https://issues.apache.org/jira/browse/HDDS-717
 Project: Hadoop Distributed Data Store
  Issue Type: Test
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee









[jira] [Commented] (HDDS-716) Update ozone to latest ratis snapshot build(0.3.0-aa38160-SNAPSHOT)

2018-10-24 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662200#comment-16662200
 ] 

Shashikant Banerjee commented on HDDS-716:
--

Thanks [~msingh] for working on this and [~jnp] for the review. I have
committed this to trunk. This Jira needs a patch for the ozone-0.3 branch as well.

> Update ozone to latest ratis snapshot build(0.3.0-aa38160-SNAPSHOT)
> ---
>
> Key: HDDS-716
> URL: https://issues.apache.org/jira/browse/HDDS-716
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDDS-716.001.patch, HDDS-716.002.patch, 
> HDDS-716.003.patch
>
>
> This jira updates ozone to the latest ratis snapshot
> build (0.3.0-aa38160-SNAPSHOT).






[jira] [Commented] (HDDS-716) Update ozone to latest ratis snapshot build(0.3.0-aa38160-SNAPSHOT)

2018-10-24 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662035#comment-16662035
 ] 

Shashikant Banerjee commented on HDDS-716:
--

+1 on the latest patch. I will commit this shortly.

> Update ozone to latest ratis snapshot build(0.3.0-aa38160-SNAPSHOT)
> ---
>
> Key: HDDS-716
> URL: https://issues.apache.org/jira/browse/HDDS-716
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDDS-716.001.patch, HDDS-716.002.patch, 
> HDDS-716.003.patch
>
>
> This jira updates ozone to the latest ratis snapshot
> build (0.3.0-aa38160-SNAPSHOT).






[jira] [Created] (HDDS-726) Ozone Client should update SCM to move the container out of allocation path in case a write transaction fails

2018-10-24 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-726:


 Summary: Ozone Client should update SCM to move the container out 
of allocation path in case a write transaction fails
 Key: HDDS-726
 URL: https://issues.apache.org/jira/browse/HDDS-726
 Project: Hadoop Distributed Data Store
  Issue Type: Test
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee


Once a container write transaction fails, the container will be marked corrupted.
When the Ozone client gets an exception in such a case, it should tell SCM to
move the container out of the allocation path. SCM will eventually close the
container.
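The proposed flow could look roughly like the sketch below; `notifyWriteFailure` is a hypothetical RPC name for illustration, not an existing SCM API.

```java
// Sketch of the proposed client behavior: on a failed write, report the
// container to SCM so it is taken out of the allocation path; SCM will
// eventually close it (illustrative only; the RPC name is hypothetical).
class WriteFailureHandler {
  // Hypothetical stand-in for the SCM client protocol.
  interface ScmClient {
    void notifyWriteFailure(long containerId);
  }

  // Returns true if the write succeeded, false if SCM was notified.
  static boolean writeChunk(ScmClient scm, long containerId, Runnable write) {
    try {
      write.run();
      return true;
    } catch (RuntimeException e) {
      scm.notifyWriteFailure(containerId);  // move container out of allocation
      return false;
    }
  }
}
```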






[jira] [Commented] (HDDS-728) Datanodes are going to dead state after some interval

2018-10-24 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662359#comment-16662359
 ] 

Shashikant Banerjee commented on HDDS-728:
--

[~ssulav], can you attach the SCM logs as well as logs for other Datanodes?

> Datanodes are going to dead state after some interval
> -
>
> Key: HDDS-728
> URL: https://issues.apache.org/jira/browse/HDDS-728
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.3.0
>Reporter: Soumitra Sulav
>Priority: Major
> Attachments: 
> hadoop-root-datanode-ctr-e138-1518143905142-541600-02-03.hwx.site.log
>
>
> Set up a 5-datanode Ozone cluster with HDP on top of it.
> After restarting all HDP services a few times, I encountered the issue below,
> which is making the HDP services fail.
> The same exception was observed in an old setup, but I thought it could have
> been an issue with that setup; now I have encountered the same issue in the
> new setup as well.
> {code:java}
> 2018-10-24 10:42:03,308 WARN 
> org.apache.ratis.grpc.server.GrpcServerProtocolService: 
> 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 
> 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
> org.apache.ratis.protocol.GroupMismatchException: 
> 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
> at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
> at 
> org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
> at 
> org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
> at 
> org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2018-10-24 10:42:03,342 WARN 
> org.apache.ratis.grpc.server.GrpcServerProtocolService: 
> 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 
> 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
> org.apache.ratis.protocol.GroupMismatchException: 
> 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
> at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
> at 
> org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
> at 
> org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
> at 
> org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2018-10-24 

[jira] [Updated] (HDDS-676) Enable Read from open Containers via Standalone Protocol

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-676:
-
  Resolution: Fixed
   Fix Version/s: 0.3.0
Target Version/s: 0.3.0, 0.4.0  (was: 0.3.0)
  Status: Resolved  (was: Patch Available)

Thanks [~nandakumar131], [~jnp], [~anu] for the reviews. I have committed this 
change to trunk and ozone-0.3 branch.

> Enable Read from open Containers via Standalone Protocol
> 
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0, 0.4.0
>
> Attachments: HDDS-676-ozone-0.3.000.patch, HDDS-676.001.patch, 
> HDDS-676.002.patch, HDDS-676.003.patch, HDDS-676.004.patch, 
> HDDS-676.005.patch, HDDS-676.006.patch, HDDS-676.007.patch, HDDS-676.008.patch
>
>
> With the BlockCommitSequenceId getting updated per block commit on open
> containers in both OM and the datanode, Ozone client reads can go through the
> Standalone protocol, not necessarily requiring Ratis. The client should verify
> the BCSID of the container that has the data block, which should always be
> greater than or equal to the BCSID of the block to be read, and the existing
> block BCSID should exactly match that of the block to be read. As part of
> this, the client can try to read from a replica with a supplied BCSID and
> fail over to the next one in case the block does not exist on that replica.






[jira] [Created] (HDDS-720) ContainerReportPublisher fails when the container is marked unhealthy on Datanodes

2018-10-23 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-720:


 Summary: ContainerReportPublisher fails when the container is 
marked unhealthy on Datanodes
 Key: HDDS-720
 URL: https://issues.apache.org/jira/browse/HDDS-720
 Project: Hadoop Distributed Data Store
  Issue Type: Test
  Components: Ozone Datanode, SCM
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee


{code:java}
2018-10-24 01:15:00,265 ERROR report.ReportPublisher 
(ReportPublisher.java:publishReport(88)) - Exception while publishing report.
org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: 
Invalid Container state found: 2
at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.getHddsState(KeyValueContainer.java:558)
at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.getContainerReport(KeyValueContainer.java:532)
at 
org.apache.hadoop.ozone.container.common.impl.ContainerSet.getContainerReport(ContainerSet.java:203)
at 
org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getContainerReport(OzoneContainer.java:168)
at 
org.apache.hadoop.ozone.container.common.report.ContainerReportPublisher.getReport(ContainerReportPublisher.java:83)
at 
org.apache.hadoop.ozone.container.common.report.ContainerReportPublisher.getReport(ContainerReportPublisher.java:50)
at 
org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
at 
org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
No mapping exists from the Unhealthy container state on the Datanode to the
LifecycleState of containers in SCM. Hence, the container report publisher
fails with an Invalid container state exception.

A container is marked unhealthy on a Datanode only if a write transaction
fails, so that successive updates get rejected and a close-container action is
initiated to SCM to close the container. For all practical purposes, a container
in the unhealthy state can also be mapped to a container in the closing state in
SCM.
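The fix amounts to a small state mapping, sketched here with illustrative enum names (the real states live in the HDDS protobuf definitions and differ in spelling):

```java
// Sketch of the proposed mapping: report an UNHEALTHY datanode container
// as CLOSING to SCM instead of throwing (enum names are illustrative).
class ContainerStateMapper {
  enum DatanodeState { OPEN, CLOSING, CLOSED, UNHEALTHY }
  enum ScmLifecycleState { OPEN, CLOSING, CLOSED }

  static ScmLifecycleState toScmState(DatanodeState s) {
    switch (s) {
      case OPEN:      return ScmLifecycleState.OPEN;
      case CLOSED:    return ScmLifecycleState.CLOSED;
      case CLOSING:   // fall through: a close is already in progress
      case UNHEALTHY: return ScmLifecycleState.CLOSING;
      default:        throw new IllegalArgumentException("Invalid state: " + s);
    }
  }
}
```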






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Attachment: HDDS-697.001.patch

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch, HDDS-697.001.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to
> update the BCSID in the container db on the datanode. getSmallFile should
> validate the bcsId while reading the block, similar to getBlock.






[jira] [Commented] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-23 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661217#comment-16661217
 ] 

Shashikant Banerjee commented on HDDS-697:
--

Patch v1 fixes the test failures.

testContainerStateMachineFailures seems to be a flaky test: we mark the
container unhealthy and wait for the closeContainerAction to be queued. But
before the assert that verifies the action exists in the pending-actions queue
executes, the datanode may have already removed the action from the queue to
send it to SCM. As a result, the test sometimes passes and sometimes fails. I
removed the assert on the pending-action queue from the test to make it more
stable.

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch, HDDS-697.001.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to
> update the BCSID in the container db on the datanode. getSmallFile should
> validate the bcsId while reading the block, similar to getBlock.






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Attachment: (was: HDDS-697.001.patch)

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to
> update the BCSID in the container db on the datanode. getSmallFile should
> validate the bcsId while reading the block, similar to getBlock.






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Status: Open  (was: Patch Available)

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to 
> update the BCSID in the container db on the datanode. getSmallFile should 
> validate the bcsId while reading the block, similar to getBlock.






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-23 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Attachment: HDDS-697.001.patch

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch, HDDS-697.001.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to 
> update the BCSID in the container db on the datanode. getSmallFile should 
> validate the bcsId while reading the block, similar to getBlock.






[jira] [Comment Edited] (HDDS-721) NullPointerException thrown while trying to read a file when datanode restarted

2018-10-26 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665141#comment-16665141
 ] 

Shashikant Banerjee edited comment on HDDS-721 at 10/26/18 1:03 PM:


The issue is not resolved by HDDS-676. SCM can return a Ratis 
pipeline to the client in which the field corresponding to the leaderId can be 
null. On the client, only the replication type is overwritten to STAND_ALONE. 
If the leader datanode is not set, it will still fail with a 
NullPointerException.


was (Author: shashikant):
The issue seems to be resolved by HDDS-676. SCM can return a Ratis pipeline 
to the client in which the field corresponding to the leaderId can be null. On 
the client, only the replication type is overwritten to STAND_ALONE. If the 
leader datanode is not set, it will still fail with a NullPointerException.
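The null-leader scenario discussed in the comment above amounts to a missing defensive check. A minimal sketch of the idea, assuming hypothetical names (`DatanodeStub` stands in for `DatanodeDetails`; the real fix lives in the Ozone client's pipeline handling):

```java
import java.util.List;

// Sketch: when SCM returns a pipeline whose leader field is null, fall
// back to another member instead of dereferencing null later.
public class PipelineSketch {
  record DatanodeStub(String uuid) {}

  static DatanodeStub chooseLeader(DatanodeStub leader, List<DatanodeStub> members) {
    if (leader != null) {
      return leader;
    }
    if (members == null || members.isEmpty()) {
      throw new IllegalStateException("pipeline has no datanodes");
    }
    // For a STAND_ALONE read, any pipeline member can serve the request.
    return members.get(0);
  }

  public static void main(String[] args) {
    DatanodeStub dn = new DatanodeStub("dn-1");
    System.out.println(chooseLeader(null, List.of(dn)).uuid()); // dn-1
  }
}
```

Without a fallback like this, overwriting only the replication type leaves the null leader in place, which is exactly the NullPointerException seen in the bug.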

> NullPointerException thrown while trying to read a file when datanode 
> restarted
> ---
>
> Key: HDDS-721
> URL: https://issues.apache.org/jira/browse/HDDS-721
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Priority: Critical
> Attachments: all-node-ozone-logs-1540356965.tar.gz
>
>
> Steps taken:
> ---
>  # Put a few files and directories using ozonefs.
>  # Stopped all services of the cluster.
>  # Started the SCM, OM and then the datanodes.
> While the datanodes were starting up, tried to read a file. A 
> NullPointerException was thrown.
>  
> {noformat}
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -ls -R /
> 2018-10-24 04:48:00,703 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> drwxrwxrwx - root root 0 2018-10-24 04:12 /testdir1
> -rw-rw-rw- 1 root root 5368709120 1970-02-25 15:29 /testdir1/5GB
> -rw-rw-rw- 1 root root 4798 1970-02-25 15:22 /testdir1/passwd
> drwxrwxrwx - root root 0 2018-10-24 04:46 /testdir3
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -cat 
> o3fs://fs-bucket.fs-volume/testdir1/passwd
> 2018-10-24 04:49:24,955 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> cat: Exception getting XceiverClient: 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.NullPointerException{noformat}
>  






[jira] [Commented] (HDDS-721) NullPointerException thrown while trying to read a file when datanode restarted

2018-10-26 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665141#comment-16665141
 ] 

Shashikant Banerjee commented on HDDS-721:
--

The issue seems to be resolved by HDDS-676. SCM can return a Ratis pipeline 
to the client in which the field corresponding to the leaderId can be null. On 
the client, only the replication type is overwritten to STAND_ALONE. If the 
leader datanode is not set, it will still fail with a NullPointerException.

> NullPointerException thrown while trying to read a file when datanode 
> restarted
> ---
>
> Key: HDDS-721
> URL: https://issues.apache.org/jira/browse/HDDS-721
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Priority: Critical
> Attachments: all-node-ozone-logs-1540356965.tar.gz
>
>
> Steps taken:
> ---
>  # Put a few files and directories using ozonefs.
>  # Stopped all services of the cluster.
>  # Started the SCM, OM and then the datanodes.
> While the datanodes were starting up, tried to read a file. A 
> NullPointerException was thrown.
>  
> {noformat}
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -ls -R /
> 2018-10-24 04:48:00,703 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> drwxrwxrwx - root root 0 2018-10-24 04:12 /testdir1
> -rw-rw-rw- 1 root root 5368709120 1970-02-25 15:29 /testdir1/5GB
> -rw-rw-rw- 1 root root 4798 1970-02-25 15:22 /testdir1/passwd
> drwxrwxrwx - root root 0 2018-10-24 04:46 /testdir3
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -cat 
> o3fs://fs-bucket.fs-volume/testdir1/passwd
> 2018-10-24 04:49:24,955 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> cat: Exception getting XceiverClient: 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.NullPointerException{noformat}
>  






[jira] [Assigned] (HDDS-721) NullPointerException thrown while trying to read a file when datanode restarted

2018-10-26 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee reassigned HDDS-721:


Assignee: Shashikant Banerjee

> NullPointerException thrown while trying to read a file when datanode 
> restarted
> ---
>
> Key: HDDS-721
> URL: https://issues.apache.org/jira/browse/HDDS-721
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Critical
> Attachments: all-node-ozone-logs-1540356965.tar.gz
>
>
> Steps taken:
> ---
>  # Put a few files and directories using ozonefs.
>  # Stopped all services of the cluster.
>  # Started the SCM, OM and then the datanodes.
> While the datanodes were starting up, tried to read a file. A 
> NullPointerException was thrown.
>  
> {noformat}
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -ls -R /
> 2018-10-24 04:48:00,703 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> drwxrwxrwx - root root 0 2018-10-24 04:12 /testdir1
> -rw-rw-rw- 1 root root 5368709120 1970-02-25 15:29 /testdir1/5GB
> -rw-rw-rw- 1 root root 4798 1970-02-25 15:22 /testdir1/passwd
> drwxrwxrwx - root root 0 2018-10-24 04:46 /testdir3
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -cat 
> o3fs://fs-bucket.fs-volume/testdir1/passwd
> 2018-10-24 04:49:24,955 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> cat: Exception getting XceiverClient: 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.NullPointerException{noformat}
>  






[jira] [Updated] (HDDS-721) NullPointerException thrown while trying to read a file when datanode restarted

2018-10-28 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-721:
-
Attachment: HDDS-721-ozone-0.3.000.patch

> NullPointerException thrown while trying to read a file when datanode 
> restarted
> ---
>
> Key: HDDS-721
> URL: https://issues.apache.org/jira/browse/HDDS-721
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Critical
> Attachments: HDDS-721-ozone-0.3.000.patch, 
> all-node-ozone-logs-1540356965.tar.gz
>
>
> Steps taken:
> ---
>  # Put a few files and directories using ozonefs.
>  # Stopped all services of the cluster.
>  # Started the SCM, OM and then the datanodes.
> While the datanodes were starting up, tried to read a file. A 
> NullPointerException was thrown.
>  
> {noformat}
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -ls -R /
> 2018-10-24 04:48:00,703 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> drwxrwxrwx - root root 0 2018-10-24 04:12 /testdir1
> -rw-rw-rw- 1 root root 5368709120 1970-02-25 15:29 /testdir1/5GB
> -rw-rw-rw- 1 root root 4798 1970-02-25 15:22 /testdir1/passwd
> drwxrwxrwx - root root 0 2018-10-24 04:46 /testdir3
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -cat 
> o3fs://fs-bucket.fs-volume/testdir1/passwd
> 2018-10-24 04:49:24,955 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> cat: Exception getting XceiverClient: 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.NullPointerException{noformat}
>  






[jira] [Updated] (HDDS-721) NullPointerException thrown while trying to read a file when datanode restarted

2018-10-28 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-721:
-
Status: Patch Available  (was: Open)

> NullPointerException thrown while trying to read a file when datanode 
> restarted
> ---
>
> Key: HDDS-721
> URL: https://issues.apache.org/jira/browse/HDDS-721
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Critical
> Attachments: HDDS-721-ozone-0.3.000.patch, 
> all-node-ozone-logs-1540356965.tar.gz
>
>
> Steps taken:
> ---
>  # Put a few files and directories using ozonefs.
>  # Stopped all services of the cluster.
>  # Started the SCM, OM and then the datanodes.
> While the datanodes were starting up, tried to read a file. A 
> NullPointerException was thrown.
>  
> {noformat}
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -ls -R /
> 2018-10-24 04:48:00,703 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> drwxrwxrwx - root root 0 2018-10-24 04:12 /testdir1
> -rw-rw-rw- 1 root root 5368709120 1970-02-25 15:29 /testdir1/5GB
> -rw-rw-rw- 1 root root 4798 1970-02-25 15:22 /testdir1/passwd
> drwxrwxrwx - root root 0 2018-10-24 04:46 /testdir3
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -cat 
> o3fs://fs-bucket.fs-volume/testdir1/passwd
> 2018-10-24 04:49:24,955 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> cat: Exception getting XceiverClient: 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.NullPointerException{noformat}
>  






[jira] [Commented] (HDDS-721) NullPointerException thrown while trying to read a file when datanode restarted

2018-10-28 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1470#comment-1470
 ] 

Shashikant Banerjee commented on HDDS-721:
--

This needs to be fixed only in the ozone-0.3 branch; on trunk, the problem 
should be fixed by HDDS-694.

> NullPointerException thrown while trying to read a file when datanode 
> restarted
> ---
>
> Key: HDDS-721
> URL: https://issues.apache.org/jira/browse/HDDS-721
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Critical
> Attachments: HDDS-721-ozone-0.3.000.patch, 
> all-node-ozone-logs-1540356965.tar.gz
>
>
> Steps taken:
> ---
>  # Put a few files and directories using ozonefs.
>  # Stopped all services of the cluster.
>  # Started the SCM, OM and then the datanodes.
> While the datanodes were starting up, tried to read a file. A 
> NullPointerException was thrown.
>  
> {noformat}
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -ls -R /
> 2018-10-24 04:48:00,703 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> drwxrwxrwx - root root 0 2018-10-24 04:12 /testdir1
> -rw-rw-rw- 1 root root 5368709120 1970-02-25 15:29 /testdir1/5GB
> -rw-rw-rw- 1 root root 4798 1970-02-25 15:22 /testdir1/passwd
> drwxrwxrwx - root root 0 2018-10-24 04:46 /testdir3
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -cat 
> o3fs://fs-bucket.fs-volume/testdir1/passwd
> 2018-10-24 04:49:24,955 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> cat: Exception getting XceiverClient: 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.NullPointerException{noformat}
>  






[jira] [Commented] (HDDS-749) Restructure BlockId class in Ozone

2018-10-29 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667926#comment-16667926
 ] 

Shashikant Banerjee commented on HDDS-749:
--

Patch v1 fixes the related test failure and checkstyle issue.

> Restructure BlockId class in Ozone
> --
>
> Key: HDDS-749
> URL: https://issues.apache.org/jira/browse/HDDS-749
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-749.000.patch, HDDS-749.001.patch
>
>
> As part of block allocation in SCM, SCM will return a containerBlockId, 
> which consists of a containerId and a localId. Once OM gets the allocated 
> blocks from SCM, it will create a BlockId object, which consists of the 
> containerId, the localId and the BlockCommitSequenceId.
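The id layout described above can be sketched as two small value classes. This is illustrative only; `ContainerBlockId` and `BlockId` model the ids named in the description, but the field set and factory method are assumptions, not the actual hadoop-hdds classes.

```java
// Sketch of the restructured ids: SCM hands out (containerId, localId);
// OM wraps that pair with the block commit sequence id.
public class BlockIdSketch {
  // What SCM returns from block allocation.
  record ContainerBlockId(long containerId, long localId) {}

  // What OM builds once it receives the allocated blocks.
  record BlockId(long containerId, long localId, long blockCommitSequenceId) {
    static BlockId from(ContainerBlockId cb, long bcsId) {
      return new BlockId(cb.containerId(), cb.localId(), bcsId);
    }
  }

  public static void main(String[] args) {
    ContainerBlockId fromScm = new ContainerBlockId(42L, 7L); // allocated by SCM
    BlockId omBlock = BlockId.from(fromScm, 100L);            // OM adds the BCSID
    System.out.println(omBlock.blockCommitSequenceId());      // 100
  }
}
```

Separating the two types keeps SCM unaware of commit sequencing: only OM, which learns the BCSID at commit time, ever constructs the full `BlockId`.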






[jira] [Updated] (HDDS-749) Restructure BlockId class in Ozone

2018-10-29 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-749:
-
Attachment: HDDS-749.001.patch

> Restructure BlockId class in Ozone
> --
>
> Key: HDDS-749
> URL: https://issues.apache.org/jira/browse/HDDS-749
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-749.000.patch, HDDS-749.001.patch
>
>
> As part of block allocation in SCM, SCM will return a containerBlockId, 
> which consists of a containerId and a localId. Once OM gets the allocated 
> blocks from SCM, it will create a BlockId object, which consists of the 
> containerId, the localId and the BlockCommitSequenceId.






[jira] [Updated] (HDDS-721) NullPointerException thrown while trying to read a file when datanode restarted

2018-10-29 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-721:
-
Attachment: HDDS-721-ozone-0.3.001.patch

> NullPointerException thrown while trying to read a file when datanode 
> restarted
> ---
>
> Key: HDDS-721
> URL: https://issues.apache.org/jira/browse/HDDS-721
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Critical
> Attachments: HDDS-721-ozone-0.3.000.patch, 
> HDDS-721-ozone-0.3.001.patch, all-node-ozone-logs-1540356965.tar.gz
>
>
> Steps taken:
> ---
>  # Put a few files and directories using ozonefs.
>  # Stopped all services of the cluster.
>  # Started the SCM, OM and then the datanodes.
> While the datanodes were starting up, tried to read a file. A 
> NullPointerException was thrown.
>  
> {noformat}
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -ls -R /
> 2018-10-24 04:48:00,703 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> drwxrwxrwx - root root 0 2018-10-24 04:12 /testdir1
> -rw-rw-rw- 1 root root 5368709120 1970-02-25 15:29 /testdir1/5GB
> -rw-rw-rw- 1 root root 4798 1970-02-25 15:22 /testdir1/passwd
> drwxrwxrwx - root root 0 2018-10-24 04:46 /testdir3
> [root@ctr-e138-1518143905142-53-01-03 ~]# 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -cat 
> o3fs://fs-bucket.fs-volume/testdir1/passwd
> 2018-10-24 04:49:24,955 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> cat: Exception getting XceiverClient: 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.NullPointerException{noformat}
>  






[jira] [Updated] (HDDS-749) Restructure BlockId class in Ozone

2018-10-30 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-749:
-
Attachment: HDDS-749.002.patch

> Restructure BlockId class in Ozone
> --
>
> Key: HDDS-749
> URL: https://issues.apache.org/jira/browse/HDDS-749
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-749.000.patch, HDDS-749.001.patch, 
> HDDS-749.002.patch
>
>
> As part of block allocation in SCM, SCM will return a containerBlockId, 
> which consists of a containerId and a localId. Once OM gets the allocated 
> blocks from SCM, it will create a BlockId object, which consists of the 
> containerId, the localId and the BlockCommitSequenceId.






[jira] [Commented] (HDDS-749) Restructure BlockId class in Ozone

2018-10-30 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668197#comment-16668197
 ] 

Shashikant Banerjee commented on HDDS-749:
--

Thanks [~jnp], for the review comments. Patch v2 addresses your review comments.

> Restructure BlockId class in Ozone
> --
>
> Key: HDDS-749
> URL: https://issues.apache.org/jira/browse/HDDS-749
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-749.000.patch, HDDS-749.001.patch, 
> HDDS-749.002.patch
>
>
> As part of block allocation in SCM, SCM will return a containerBlockId, 
> which consists of a containerId and a localId. Once OM gets the allocated 
> blocks from SCM, it will create a BlockId object, which consists of the 
> containerId, the localId and the BlockCommitSequenceId.






[jira] [Commented] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-30 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668475#comment-16668475
 ] 

Shashikant Banerjee commented on HDDS-697:
--

Patch v2 is rebased onto the latest trunk.

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch, HDDS-697.001.patch, 
> HDDS-697.002.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to 
> update the BCSID in the container db on the datanode. getSmallFile should 
> validate the bcsId while reading the block, similar to getBlock.






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-30 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Attachment: HDDS-697.002.patch

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch, HDDS-697.001.patch, 
> HDDS-697.002.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to 
> update the BCSID in the container db on the datanode. getSmallFile should 
> validate the bcsId while reading the block, similar to getBlock.






[jira] [Created] (HDDS-774) Remove OpenContainerBlockMap from datanode

2018-10-31 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-774:


 Summary: Remove OpenContainerBlockMap from datanode
 Key: HDDS-774
 URL: https://issues.apache.org/jira/browse/HDDS-774
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Affects Versions: 0.4.0
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
 Fix For: 0.4.0


With HDDS-675, partial flush of uncommitted keys on Datanodes is not required. 
OpenContainerBlockMap hence serves no purpose anymore.






[jira] [Updated] (HDDS-709) Modify Close Container handling sequence on datanodes

2018-10-31 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-709:
-
Attachment: HDDS-709.000.patch

> Modify Close Container handling sequence on datanodes
> -
>
> Key: HDDS-709
> URL: https://issues.apache.org/jira/browse/HDDS-709
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-709.000.patch
>
>
> With the quasi-closed container state for handling majority node failures, the 
> close container handling sequence on datanodes needs to change. Once the 
> datanodes receive a close container command from SCM, the open container 
> replicas are individually marked as being in the CLOSING state. In the CLOSING 
> state, only transactions coming from the Ratis leader are allowed; all other 
> write transactions will fail. A close container transaction is then queued via 
> Ratis on the leader and replayed to the followers, which makes the replica 
> transition to the CLOSED/QUASI_CLOSED state.
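The lifecycle described in the issue can be modeled as a tiny state machine. This is a sketch under assumed names (`ContainerStateSketch` and its methods are illustrative, not the actual datanode container manager):

```java
// Sketch of the container close sequence: SCM's command moves an OPEN
// replica to CLOSING; only the leader-replicated close transaction then
// moves it to CLOSED, or QUASI_CLOSED when a majority is lost.
public class ContainerStateSketch {
  enum State { OPEN, CLOSING, CLOSED, QUASI_CLOSED }

  // SCM close command: mark the open replica as closing.
  static State onScmCloseCommand(State s) {
    return s == State.OPEN ? State.CLOSING : s;
  }

  // In CLOSING, writes not replicated through the Ratis leader are rejected.
  static boolean acceptWrite(State s, boolean fromRatisLeader) {
    return s == State.OPEN || (s == State.CLOSING && fromRatisLeader);
  }

  // The close transaction replayed via Ratis finishes the transition.
  static State onCloseTransaction(State s, boolean majorityAlive) {
    if (s != State.CLOSING) {
      throw new IllegalStateException("close transaction in state " + s);
    }
    return majorityAlive ? State.CLOSED : State.QUASI_CLOSED;
  }

  public static void main(String[] args) {
    State s = onScmCloseCommand(State.OPEN);         // CLOSING
    System.out.println(acceptWrite(s, false));       // false: not from the leader
    System.out.println(onCloseTransaction(s, true)); // CLOSED
  }
}
```

Routing the final transition through a Ratis transaction is what keeps all replicas agreeing on exactly which log index the container closed at.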






[jira] [Updated] (HDDS-709) Modify Close Container handling sequence on datanodes

2018-10-31 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-709:
-
Status: Patch Available  (was: Open)

> Modify Close Container handling sequence on datanodes
> -
>
> Key: HDDS-709
> URL: https://issues.apache.org/jira/browse/HDDS-709
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-709.000.patch
>
>
> With the quasi-closed container state for handling majority node failures, the 
> close container handling sequence on datanodes needs to change. Once the 
> datanodes receive a close container command from SCM, the open container 
> replicas are individually marked as being in the CLOSING state. In the CLOSING 
> state, only transactions coming from the Ratis leader are allowed; all other 
> write transactions will fail. A close container transaction is then queued via 
> Ratis on the leader and replayed to the followers, which makes the replica 
> transition to the CLOSED/QUASI_CLOSED state.






[jira] [Updated] (HDDS-675) Add blocking buffer and use watchApi for flush/close in OzoneClient

2018-10-31 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-675:
-
Attachment: HDDS-675.000.patch

> Add blocking buffer and use watchApi for flush/close in OzoneClient
> ---
>
> Key: HDDS-675
> URL: https://issues.apache.org/jira/browse/HDDS-675
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-675.000.patch
>
>
> To handle two-node failures, a blocking buffer will be used that waits 
> for the flush commit index to be updated on all replicas of a container via 
> Ratis.
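The blocking-buffer idea can be sketched as follows. Names are hypothetical, and the busy-wait loop stands in for what the real client does with a Ratis watch request on the flush index:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: flush() blocks until every replica's applied index has reached
// the index of the flushed entry, so data buffered by the client can only
// be released once it is durable on all three replicas.
public class BlockingBufferSketch {
  static boolean allCaughtUp(List<AtomicLong> replicaIndexes, long flushIndex) {
    return replicaIndexes.stream().allMatch(i -> i.get() >= flushIndex);
  }

  static void waitForCommit(List<AtomicLong> replicaIndexes, long flushIndex)
      throws InterruptedException {
    while (!allCaughtUp(replicaIndexes, flushIndex)) {
      Thread.sleep(10); // stand-in for a Ratis watch on ALL_COMMITTED
    }
  }

  public static void main(String[] args) throws InterruptedException {
    List<AtomicLong> replicas =
        List.of(new AtomicLong(5), new AtomicLong(5), new AtomicLong(4));
    replicas.get(2).set(5);          // lagging follower catches up
    waitForCommit(replicas, 5L);     // returns once all replicas reach index 5
    System.out.println(allCaughtUp(replicas, 5L)); // true
  }
}
```

Blocking on the all-replica commit index is what lets the client survive two node failures: anything it has acknowledged is already applied everywhere.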






[jira] [Updated] (HDDS-675) Add blocking buffer and use watchApi for flush/close in OzoneClient

2018-10-31 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-675:
-
Status: Patch Available  (was: Open)

> Add blocking buffer and use watchApi for flush/close in OzoneClient
> ---
>
> Key: HDDS-675
> URL: https://issues.apache.org/jira/browse/HDDS-675
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-675.000.patch
>
>
> To handle two-node failures, a blocking buffer will be used that waits 
> for the flush commit index to be updated on all replicas of a container via 
> Ratis.






[jira] [Commented] (HDDS-675) Add blocking buffer and use watchApi for flush/close in OzoneClient

2018-10-31 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670051#comment-16670051
 ] 

Shashikant Banerjee commented on HDDS-675:
--

Updated the first patch.

> Add blocking buffer and use watchApi for flush/close in OzoneClient
> ---
>
> Key: HDDS-675
> URL: https://issues.apache.org/jira/browse/HDDS-675
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-675.000.patch
>
>
> For handling 2-node failures, a blocking buffer will be used, which will wait 
> for the flush commit index to be updated on all replicas of a container via 
> Ratis.






[jira] [Commented] (HDDS-774) Remove OpenContainerBlockMap from datanode

2018-10-31 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670121#comment-16670121
 ] 

Shashikant Banerjee commented on HDDS-774:
--

This is blocked on HDDS-675.

> Remove OpenContainerBlockMap from datanode
> --
>
> Key: HDDS-774
> URL: https://issues.apache.org/jira/browse/HDDS-774
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-774.000.patch
>
>
> With HDDS-675, partial flush of uncommitted keys on datanodes is no longer 
> required, so OpenContainerBlockMap no longer serves any purpose.






[jira] [Updated] (HDDS-774) Remove OpenContainerBlockMap from datanode

2018-10-31 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-774:
-
Attachment: HDDS-774.000.patch

> Remove OpenContainerBlockMap from datanode
> --
>
> Key: HDDS-774
> URL: https://issues.apache.org/jira/browse/HDDS-774
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-774.000.patch
>
>
> With HDDS-675, partial flush of uncommitted keys on datanodes is no longer 
> required, so OpenContainerBlockMap no longer serves any purpose.






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-31 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks [~jnp] for the review. I have fixed the checkstyle issues and committed 
this change to trunk.

The test failures and ASF license warnings are not related to the patch.

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch, HDDS-697.001.patch, 
> HDDS-697.002.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to 
> update the BCSID in the container DB on the datanode. getSmallFile should 
> validate the BCSID while reading the block, just as getBlock does.
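The update/validate pair described above can be modeled in a few lines. The names below are invented for illustration; the real code lives in the datanode's container handlers and container DB:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: each container tracks the highest block-commit-sequence-id
// (BCSID) applied so far. Writes (putBlock/putSmallFile) advance it;
// reads (getBlock/getSmallFile) must not be served for a BCSID beyond it,
// since that block is not yet committed on this replica.
class BcsIdTracker {
  private final Map<Long, Long> containerBcsId = new HashMap<>();

  // putBlock/putSmallFile path: record the transaction's BCSID.
  void update(long containerId, long bcsId) {
    containerBcsId.merge(containerId, bcsId, Long::max);
  }

  // getBlock/getSmallFile path: validate before serving the block.
  void validate(long containerId, long requestedBcsId) {
    long committed = containerBcsId.getOrDefault(containerId, 0L);
    if (requestedBcsId > committed) {
      throw new IllegalStateException("BCSID " + requestedBcsId
          + " not yet committed; container is at " + committed);
    }
  }
}
```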






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-31 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Fix Version/s: 0.4.0

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-697.000.patch, HDDS-697.001.patch, 
> HDDS-697.002.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to 
> update the BCSID in the container DB on the datanode. getSmallFile should 
> validate the BCSID while reading the block, just as getBlock does.






[jira] [Updated] (HDDS-709) Modify Close Container handling sequence on datanodes

2018-10-31 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-709:
-
Status: Open  (was: Patch Available)

> Modify Close Container handling sequence on datanodes
> -
>
> Key: HDDS-709
> URL: https://issues.apache.org/jira/browse/HDDS-709
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-709.000.patch
>
>
> With the quasi-closed container state for handling majority node failures, 
> the close container handling sequence on datanodes needs to change. Once the 
> datanodes receive a close container command from SCM, each open container 
> replica is individually marked as being in the CLOSING state. In the CLOSING 
> state, only transactions coming from the Ratis leader are allowed; all other 
> write transactions will fail. A close container transaction will be queued 
> via Ratis on the leader and replayed to the followers, which transitions the 
> replica to the CLOSED/QUASI_CLOSED state.
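The sequence in the description condenses into a small state model. The names below are illustrative, not the exact HddsProtos lifecycle states:

```java
// Toy model of a container replica's close sequence: SCM's close command
// moves an OPEN replica to CLOSING; in CLOSING, only leader-originated
// transactions are applied; the close transaction replicated via Ratis
// finally moves the replica to CLOSED (or QUASI_CLOSED on a degraded one).
class ContainerReplica {
  enum State { OPEN, CLOSING, CLOSED, QUASI_CLOSED }

  private State state = State.OPEN;

  // SCM close-container command: mark the replica CLOSING right away.
  void onCloseCommand() {
    if (state == State.OPEN) {
      state = State.CLOSING;
    }
  }

  // Write transactions: in CLOSING, only those from the Ratis leader pass.
  boolean acceptWrite(boolean fromLeader) {
    return state == State.OPEN || (state == State.CLOSING && fromLeader);
  }

  // Close transaction queued on the leader and replayed on followers.
  void applyCloseTxn(boolean healthy) {
    state = healthy ? State.CLOSED : State.QUASI_CLOSED;
  }

  State state() {
    return state;
  }
}
```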






[jira] [Updated] (HDDS-675) Add blocking buffer and use watchApi for flush/close in OzoneClient

2018-11-01 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-675:
-
Attachment: HDDS-675.001.patch

> Add blocking buffer and use watchApi for flush/close in OzoneClient
> ---
>
> Key: HDDS-675
> URL: https://issues.apache.org/jira/browse/HDDS-675
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-675.000.patch, HDDS-675.001.patch
>
>
> For handling 2-node failures, a blocking buffer will be used, which will wait 
> for the flush commit index to be updated on all replicas of a container via 
> Ratis.






[jira] [Commented] (HDDS-771) ChunkGroupOutputStream stream entries need to be properly updated on closed container exception

2018-11-01 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671391#comment-16671391
 ] 

Shashikant Banerjee commented on HDDS-771:
--

Thanks [~ljain], for the patch. The patch looks good to me. I am +1 on this.

> ChunkGroupOutputStream stream entries need to be properly updated on closed 
> container exception
> ---
>
> Key: HDDS-771
> URL: https://issues.apache.org/jira/browse/HDDS-771
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Blocker
> Attachments: HDDS-771-ozone-0.3.001.patch, HDDS-771.001.patch
>
>
> Currently, ChunkGroupOutputStream does not increment the currentStreamIndex 
> when a chunk write completes but there is no data in the buffer. This leads 
> to the overwriting of a stream entry.
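A toy model of the indexing bug, with invented names (this is not the real client code): each block of a key gets its own stream entry, and once the current entry's chunk write completes, the index must advance even when the local buffer holds no leftover data, otherwise the next write lands on (and overwrites) the finished entry.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the stream-entry index handling described above.
class StreamEntryIndex {
  private final List<StringBuilder> streamEntries = new ArrayList<>();
  private int currentStreamIndex;

  void allocateEntry() {
    streamEntries.add(new StringBuilder());
  }

  // Write data into the entry selected by currentStreamIndex.
  void write(String data) {
    streamEntries.get(currentStreamIndex).append(data);
  }

  // Called when a chunk write completes. The fixed behaviour advances the
  // index unconditionally; the buggy version skipped the increment when
  // bufferedBytes == 0, leaving the index on the finished entry.
  void handleWriteComplete(int bufferedBytes) {
    currentStreamIndex++;
  }

  String entryData(int i) {
    return streamEntries.get(i).toString();
  }
}
```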






[jira] [Updated] (HDDS-771) ChunkGroupOutputStream stream entries need to be properly updated on closed container exception

2018-11-01 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-771:
-
   Resolution: Fixed
Fix Version/s: 0.4.0
   0.3.0
   Status: Resolved  (was: Patch Available)

Thanks [~ljain] for the contribution. I have committed this change to trunk as 
well as ozone-0.3 branch.

> ChunkGroupOutputStream stream entries need to be properly updated on closed 
> container exception
> ---
>
> Key: HDDS-771
> URL: https://issues.apache.org/jira/browse/HDDS-771
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Blocker
> Fix For: 0.3.0, 0.4.0
>
> Attachments: HDDS-771-ozone-0.3.001.patch, HDDS-771.001.patch
>
>
> Currently, ChunkGroupOutputStream does not increment the currentStreamIndex 
> when a chunk write completes but there is no data in the buffer. This leads 
> to the overwriting of a stream entry.






[jira] [Commented] (HDDS-675) Add blocking buffer and use watchApi for flush/close in OzoneClient

2018-11-01 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671455#comment-16671455
 ] 

Shashikant Banerjee commented on HDDS-675:
--

Patch v1: Rebased to latest trunk.

> Add blocking buffer and use watchApi for flush/close in OzoneClient
> ---
>
> Key: HDDS-675
> URL: https://issues.apache.org/jira/browse/HDDS-675
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-675.000.patch, HDDS-675.001.patch
>
>
> For handling 2-node failures, a blocking buffer will be used, which will wait 
> for the flush commit index to be updated on all replicas of a container via 
> Ratis.






[jira] [Updated] (HDDS-697) update and validate the BCSID for PutSmallFile/GetSmallFile command

2018-10-30 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-697:
-
Status: Patch Available  (was: Open)

> update and validate the BCSID for PutSmallFile/GetSmallFile command
> ---
>
> Key: HDDS-697
> URL: https://issues.apache.org/jira/browse/HDDS-697
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-697.000.patch, HDDS-697.001.patch, 
> HDDS-697.002.patch
>
>
> Similar to putBlock/getBlock, the putSmallFile transaction in Ratis needs to 
> update the BCSID in the container DB on the datanode. getSmallFile should 
> validate the BCSID while reading the block, just as getBlock does.






[jira] [Commented] (HDDS-749) Restructure BlockId class in Ozone

2018-10-30 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668363#comment-16668363
 ] 

Shashikant Banerjee commented on HDDS-749:
--

The test failure reported is not related to the patch.

Thanks [~jnp], for the review. I have committed this change to trunk.

> Restructure BlockId class in Ozone
> --
>
> Key: HDDS-749
> URL: https://issues.apache.org/jira/browse/HDDS-749
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-749.000.patch, HDDS-749.001.patch, 
> HDDS-749.002.patch
>
>
> As part of block allocation, SCM will return a containerBlockId, which 
> consists of a containerId and a localId. Once OM gets the allocated blocks 
> from SCM, it will create a BlockId object consisting of containerID, 
> localId, and BlockCommitSequenceId.
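The two-level id described above might look like the following sketch. Field and accessor names are a guess at the shape, not the committed code:

```java
// SCM-allocated part: identifies a block inside a container.
final class ContainerBlockID {
  private final long containerID;
  private final long localID;

  ContainerBlockID(long containerID, long localID) {
    this.containerID = containerID;
    this.localID = localID;
  }

  long getContainerID() { return containerID; }
  long getLocalID() { return localID; }
}

// OM-visible id: wraps the SCM part and adds the commit sequence id.
final class BlockID {
  private final ContainerBlockID containerBlockID;
  private final long blockCommitSequenceId;

  BlockID(ContainerBlockID containerBlockID, long blockCommitSequenceId) {
    this.containerBlockID = containerBlockID;
    this.blockCommitSequenceId = blockCommitSequenceId;
  }

  long getContainerID() { return containerBlockID.getContainerID(); }
  long getLocalID() { return containerBlockID.getLocalID(); }
  long getBlockCommitSequenceId() { return blockCommitSequenceId; }
}
```

Splitting the types this way keeps the SCM-facing id free of the BCSID, which only exists once a write has gone through Ratis.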






[jira] [Updated] (HDDS-749) Restructure BlockId class in Ozone

2018-10-30 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-749:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Restructure BlockId class in Ozone
> --
>
> Key: HDDS-749
> URL: https://issues.apache.org/jira/browse/HDDS-749
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-749.000.patch, HDDS-749.001.patch, 
> HDDS-749.002.patch
>
>
> As part of block allocation, SCM will return a containerBlockId, which 
> consists of a containerId and a localId. Once OM gets the allocated blocks 
> from SCM, it will create a BlockId object consisting of containerID, 
> localId, and BlockCommitSequenceId.






[jira] [Commented] (HDDS-728) Datanodes are going to dead state after some interval

2018-10-25 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663767#comment-16663767
 ] 

Shashikant Banerjee commented on HDDS-728:
--

Thanks [~msingh] for the patch. The patch looks good to me as well. In addition 
to Nanda's comments:

I think it's better to have the executor service array in 
ContainerStateMachine itself and shut it down during close. Since we are now 
passing an array reference through the ContainerStateMachine constructor, it 
may trigger a findbugs warning as well.
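The ownership change being suggested can be sketched like this (hypothetical names; the point is only that the state machine creates its own executors and closes them itself, instead of receiving a shared array reference):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// The state machine owns its executor pool: it builds the array in the
// constructor and shuts every executor down in close(), so no external
// array reference is ever passed in.
class StateMachineExecutors implements AutoCloseable {
  private final ExecutorService[] executors;

  StateMachineExecutors(int numExecutors) {
    executors = new ExecutorService[numExecutors];
    for (int i = 0; i < numExecutors; i++) {
      executors[i] = Executors.newSingleThreadExecutor();
    }
  }

  // Pin work to one executor, e.g. hashed by container id, so operations
  // on the same container stay ordered.
  CompletableFuture<Void> submit(long key, Runnable task) {
    int idx = (int) Math.floorMod(key, (long) executors.length);
    return CompletableFuture.runAsync(task, executors[idx]);
  }

  @Override
  public void close() {
    for (ExecutorService e : executors) {
      e.shutdown();
    }
  }
}
```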

> Datanodes are going to dead state after some interval
> -
>
> Key: HDDS-728
> URL: https://issues.apache.org/jira/browse/HDDS-728
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.3.0
>Reporter: Soumitra Sulav
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDDS-728.001.patch, 
> hadoop-root-datanode-ctr-e138-1518143905142-541600-02-02.hwx.site.log, 
> hadoop-root-datanode-ctr-e138-1518143905142-541600-02-03.hwx.site.log, 
> hadoop-root-om-ctr-e138-1518143905142-541600-02-02.hwx.site.log, 
> hadoop-root-scm-ctr-e138-1518143905142-541600-02-02.hwx.site.log, 
> om-audit-ctr-e138-1518143905142-541600-02-02.hwx.site.log
>
>
> Set up a 5-datanode Ozone cluster with HDP on top of it.
> After restarting all the HDP services a few times, I encountered the issue 
> below, which is causing the HDP services to fail.
> The same exception was observed in an old setup, where I assumed it was a 
> problem with that particular setup, but I have now hit the same issue in the 
> new setup as well.
> {code:java}
> 2018-10-24 10:42:03,308 WARN 
> org.apache.ratis.grpc.server.GrpcServerProtocolService: 
> 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 
> 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
> org.apache.ratis.protocol.GroupMismatchException: 
> 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
> at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
> at 
> org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
> at 
> org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
> at 
> org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
> at 
> org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2018-10-24 10:42:03,342 WARN 
> org.apache.ratis.grpc.server.GrpcServerProtocolService: 
> 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 
> 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
> org.apache.ratis.protocol.GroupMismatchException: 
> 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
> at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
> at 
> org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
> at 
> org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
> at 
> org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> at 
> 

[jira] [Commented] (HDDS-799) writeStateMachineData times out

2018-11-03 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16674276#comment-16674276
 ] 

Shashikant Banerjee commented on HDDS-799:
--

Thanks [~msingh], for the patch. The test failures are related. Can you please 
check?

> writeStateMachineData times out
> ---
>
> Key: HDDS-799
> URL: https://issues.apache.org/jira/browse/HDDS-799
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Assignee: Mukul Kumar Singh
>Priority: Blocker
> Fix For: 0.3.0
>
> Attachments: HDDS-799-ozone-0.3.001.patch, 
> all-node-ozone-logs-1540979056.tar.gz
>
>
> datanode stopped due to following error :
> datanode.log
> {noformat}
> 2018-10-31 09:12:04,517 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 9fab9937-fbcd-4196-8014-cb165045724b: set configuration 169: 
> [9fab9937-fbcd-4196-8014-cb165045724b:172.27.15.131:9858, 
> ce0084c2-97cd-4c97-9378-e5175daad18b:172.27.15.139:9858, 
> f0291cb4-7a48-456a-847f-9f91a12aa850:172.27.38.9:9858], old=null at 169
> 2018-10-31 09:12:22,187 ERROR org.apache.ratis.server.storage.RaftLogWorker: 
> Terminating with exit status 1: 
> 9fab9937-fbcd-4196-8014-cb165045724b-RaftLogWorker failed.
> org.apache.ratis.protocol.TimeoutIOException: Timeout: WriteLog:182: (t:10, 
> i:182), STATEMACHINELOGENTRY, client-611073BBFA46, 
> cid=127-writeStateMachineData
>  at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:87)
>  at 
> org.apache.ratis.server.storage.RaftLogWorker$WriteLog.execute(RaftLogWorker.java:310)
>  at org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:182)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.TimeoutException
>  at 
> java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
>  at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>  at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:79)
>  ... 3 more{noformat}






[jira] [Commented] (HDDS-799) writeStateMachineData times out

2018-11-04 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16674341#comment-16674341
 ] 

Shashikant Banerjee commented on HDDS-799:
--

Thanks [~msingh] for updating the patch. I am +1 on this. Can you please 
provide a patch for trunk as well?

> writeStateMachineData times out
> ---
>
> Key: HDDS-799
> URL: https://issues.apache.org/jira/browse/HDDS-799
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Assignee: Mukul Kumar Singh
>Priority: Blocker
> Fix For: 0.3.0
>
> Attachments: HDDS-799-ozone-0.3.001.patch, 
> HDDS-799-ozone-0.3.002.patch, all-node-ozone-logs-1540979056.tar.gz
>
>
> datanode stopped due to following error :
> datanode.log
> {noformat}
> 2018-10-31 09:12:04,517 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 9fab9937-fbcd-4196-8014-cb165045724b: set configuration 169: 
> [9fab9937-fbcd-4196-8014-cb165045724b:172.27.15.131:9858, 
> ce0084c2-97cd-4c97-9378-e5175daad18b:172.27.15.139:9858, 
> f0291cb4-7a48-456a-847f-9f91a12aa850:172.27.38.9:9858], old=null at 169
> 2018-10-31 09:12:22,187 ERROR org.apache.ratis.server.storage.RaftLogWorker: 
> Terminating with exit status 1: 
> 9fab9937-fbcd-4196-8014-cb165045724b-RaftLogWorker failed.
> org.apache.ratis.protocol.TimeoutIOException: Timeout: WriteLog:182: (t:10, 
> i:182), STATEMACHINELOGENTRY, client-611073BBFA46, 
> cid=127-writeStateMachineData
>  at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:87)
>  at 
> org.apache.ratis.server.storage.RaftLogWorker$WriteLog.execute(RaftLogWorker.java:310)
>  at org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:182)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.TimeoutException
>  at 
> java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
>  at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>  at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:79)
>  ... 3 more{noformat}






[jira] [Commented] (HDDS-794) Add configs to set StateMachineData write timeout in ContainerStateMachine

2018-11-04 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16674452#comment-16674452
 ] 

Shashikant Banerjee commented on HDDS-794:
--

Thanks [~msingh], for the review. Patch v1 addresses your review comments.

> Add configs to set StateMachineData write timeout in ContainerStateMachine
> --
>
> Key: HDDS-794
> URL: https://issues.apache.org/jira/browse/HDDS-794
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0, 0.4.0
>
> Attachments: HDDS-794.000.patch, HDDS-794.001.patch
>
>
> The patch adds config settings in Ozone to enable/disable the timeout for 
> StateMachineData writes via Ratis. It also adds some debug logs in the 
> writeChunk handling path inside the datanode.






[jira] [Updated] (HDDS-794) Add configs to set StateMachineData write timeout in ContainerStateMachine

2018-11-04 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-794:
-
Attachment: HDDS-794.001.patch

> Add configs to set StateMachineData write timeout in ContainerStateMachine
> --
>
> Key: HDDS-794
> URL: https://issues.apache.org/jira/browse/HDDS-794
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.3.0, 0.4.0
>
> Attachments: HDDS-794.000.patch, HDDS-794.001.patch
>
>
> The patch adds config settings in Ozone to enable/disable the timeout for 
> StateMachineData writes via Ratis. It also adds some debug logs in the 
> writeChunk handling path inside the datanode.






[jira] [Created] (HDDS-794) Add configs to set StateMachineData write timeout in ContainerStateMachine

2018-11-02 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-794:


 Summary: Add configs to set StateMachineData write timeout in 
ContainerStateMachine
 Key: HDDS-794
 URL: https://issues.apache.org/jira/browse/HDDS-794
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
 Fix For: 0.3.0, 0.4.0


The patch adds config settings in Ozone to enable/disable the timeout for 
StateMachineData writes via Ratis. It also adds some debug logs in the 
writeChunk handling path inside the datanode.
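A toggle of this kind would typically surface as an ozone-site.xml property. The key, default, and description below are purely illustrative; the exact name is defined by the patch itself:

```xml
<property>
  <name>dfs.container.ratis.statemachinedata.write.timeout.enabled</name>
  <value>true</value>
  <description>
    Illustrative key only: when true, writeStateMachineData calls submitted
    through Ratis are bounded by a timeout instead of blocking indefinitely
    when the datanode cannot persist the chunk data in time.
  </description>
</property>
```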





