[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2018-02-27 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16378646#comment-16378646
 ] 

Shashikant Banerjee edited comment on HDFS-12794 at 2/27/18 2:26 PM:
-

Patch v11 adds some tests that write keys and validate the writes by setting the 
chunk size and streamBufferSize.


was (Author: shashikant):
Patch adds some tests to write keys and validate writes by setting the chunk 
and streamBufferSize.

> Ozone: Parallelize ChunkOutputSream Writes to container
> ---
>
> Key: HDFS-12794
> URL: https://issues.apache.org/jira/browse/HDFS-12794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12794-HDFS-7240.001.patch, 
> HDFS-12794-HDFS-7240.002.patch, HDFS-12794-HDFS-7240.003.patch, 
> HDFS-12794-HDFS-7240.004.patch, HDFS-12794-HDFS-7240.005.patch, 
> HDFS-12794-HDFS-7240.006.patch, HDFS-12794-HDFS-7240.007.patch, 
> HDFS-12794-HDFS-7240.008.patch, HDFS-12794-HDFS-7240.009.patch, 
> HDFS-12794-HDFS-7240.010.patch, HDFS-12794-HDFS-7240.011.patch
>
>
> ChunkOutputStream writes are synchronous in nature. Once one chunk of data gets 
> written, the next chunk write is blocked until the previous chunk has been written 
> to the container.
> ChunkOutputStream writes should be made asynchronous, and close() on the 
> OutputStream should ensure that all dirty buffers are flushed to the container.






[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2018-01-08 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16316122#comment-16316122
 ] 

Shashikant Banerjee edited comment on HDFS-12794 at 1/8/18 11:13 AM:
-

Thanks [~xyao], for the review comments.

{code}
Line 139-166: in the new code, the rollback will position the buffer to the 
very beginning of all async operations even though some of them may succeed 
while the previous sequential write allows partial rollback. Any thoughts on 
the tradeoff here? The parallel stream write is beneficial only for certain use 
cases (e.g., large files)?
{code}

I agree that the previous code had the flexibility of partial rollback, which the 
current patch doesn't have.
{code}
public void write(byte b[], int off, int len) throws IOException {
{code}

If the write over a Java output stream fails, the write request from the client 
fails, and the client can retry the write op starting from the original offset. In 
case of an exception during the write on the stream, rolling back to the initial 
offset should work.

Given that the write API is void in nature, we are not letting the client know at 
what exact point the failure happened. Partial rollback would help if the code 
retried writing the chunk internally on certain failures, which is not the case 
here. In such situations, rolling back to the initial offset should be the usual 
action. What do you think?
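
Roughly, the rollback would look like the sketch below (illustrative only; the 
buffer field and handleWrite helper are placeholder names, not the actual patch 
code):
{code}
// Sketch: remember the buffer position at entry and restore it if the write
// fails, so the client can retry the whole call from the original offset.
public void write(byte[] b, int off, int len) throws IOException {
  int initialPosition = buffer.position();  // position when this call started
  try {
    handleWrite(b, off, len);               // may kick off async chunk writes
  } catch (IOException e) {
    buffer.position(initialPosition);       // roll back to the initial offset
    throw e;                                // client retries from the original offset
  }
}
{code}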

{code}
Line 337: should we put offset here or remove the offset parameter assuming 
this always starts with ByteBuffer offset 0.
{code}

ChunkInfo contains the field "offset", which is used in a lot of places. I think 
we can keep it here and do the change in a different jira if this is required at all.

Patch v7 addresses the remaining review comments. Please have a look.



was (Author: shashikant):
Thanks [~xyao], for the review comments.

{code}
Line 139-166: in the new code, the rollback will position the buffer to the 
very beginning of all async operations even though some of them may succeed 
while the previous sequential write allows partial rollback. Any thoughts on 
the tradeoff here? The parallel steam write is beneficial only for certain use 
cases (e.g., large files)?
{code}

I agree that the previous code had the flexibility of partial rollback which 
the current patch doesn't have.
{code}
public void write(byte b[], int off, int len) throws IOException {
{code}

If the write over an java output stream fails , write request from the client 
fails and accordingly the client can retry the write op starting from the 
original offset. In case of an exception during the write on the stream, 
rollback to initial offset should work.

Given  that the write API is void in nature, we are not letting the client know 
in any case at what exact point, the failure happened. Partial rollback can 
benefit in case the code retries the to write the chunk internally in case of 
certain failures which is not the case here. In such situations, rollback to 
the initial offset should be the usual action. What do you think ?

{code}
Line 337: should we put offset here or remove the offset parameter assuming 
this always starts with ByteBuffer offset 0.
{code}

ChunkInfo contains the filed "offset" which is used in a lot of places. I think 
we can keep it here,
and do the change in a different jira if at all this is required.

Patch v7 addresses the remaining review comments. Please have a look.


> Ozone: Parallelize ChunkOutputSream Writes to container
> ---
>
> Key: HDFS-12794
> URL: https://issues.apache.org/jira/browse/HDFS-12794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
> Attachments: HDFS-12794-HDFS-7240.001.patch, 
> HDFS-12794-HDFS-7240.002.patch, HDFS-12794-HDFS-7240.003.patch, 
> HDFS-12794-HDFS-7240.004.patch, HDFS-12794-HDFS-7240.005.patch, 
> HDFS-12794-HDFS-7240.006.patch, HDFS-12794-HDFS-7240.007.patch
>
>
> ChunkOutputStream writes are synchronous in nature. Once one chunk of data gets 
> written, the next chunk write is blocked until the previous chunk has been written 
> to the container.
> ChunkOutputStream writes should be made asynchronous, and close() on the 
> OutputStream should ensure that all dirty buffers are flushed to the container.






[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2018-01-08 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16316122#comment-16316122
 ] 

Shashikant Banerjee edited comment on HDFS-12794 at 1/8/18 11:13 AM:
-

Thanks [~xyao], for the review comments.

{code}
Line 139-166: in the new code, the rollback will position the buffer to the 
very beginning of all async operations even though some of them may succeed 
while the previous sequential write allows partial rollback. Any thoughts on 
the tradeoff here? The parallel stream write is beneficial only for certain use 
cases (e.g., large files)?
{code}

I agree that the previous code had the flexibility of partial rollback, which the 
current patch doesn't have.
{code}
public void write(byte b[], int off, int len) throws IOException {
{code}

If the write over a Java output stream fails, the write request from the client 
fails, and the client can retry the write op starting from the original offset. In 
case of an exception during the write on the stream, rolling back to the initial 
offset should work.

Given that the write API is void in nature, we are not letting the client know at 
what exact point the failure happened. Partial rollback would help if the code 
retried writing the chunk internally on certain failures, which is not the case 
here. In such situations, rolling back to the initial offset should be the usual 
action. What do you think?

{code}
Line 337: should we put offset here or remove the offset parameter assuming 
this always starts with ByteBuffer offset 0.
{code}

ChunkInfo contains the field "offset", which is used in a lot of places. I think 
we can keep it here and do the change in a different jira if this is required at all.

Patch v7 addresses the remaining review comments. Please have a look.



was (Author: shashikant):
Thanks [~xyao], for the review comments.

{code}
Line 139-166: in the new code, the rollback will position the buffer to the 
very beginning of all async operations even though some of them may succeed 
while the previous sequential write allows partial rollback. Any thoughts on 
the tradeoff here? The parallel steam write is beneficial only for certain use 
cases (e.g., large files)?
{code}

I agree that the previous code had the flexibility of partial rollback which 
the current patch doesn't have.
{code}
public void write(byte b[], int off, int len) throws IOException {
{code}

If the write over an java output stream fails , write request from the client 
fails and accordingly the client can retry the write op starting from the 
original offset. In case of an exception during the write on the stream, 
rollback to initial offset should work.

Given  that the write API is void in nature, we are not letting the client know 
in any case at what exact point, the failure happened. Partial rollback can 
benefit in case the code retries the to write the chunk internally in case of 
certain failures which is not the case here. In such situations, rollback to 
the initial offset should be the usual action. What do you think ?

{code}
Line 337: should we put offset here or remove the offset parameter assuming 
this always starts with ByteBuffer offset 0.
{code}

ChunkInfo contains the filed offset which is used in a lot of places. I think 
we can keep it here,
and do the change in a different jira if at all this is required.

Patch v7 addresses the remaining review comments. Please have a look.


> Ozone: Parallelize ChunkOutputSream Writes to container
> ---
>
> Key: HDFS-12794
> URL: https://issues.apache.org/jira/browse/HDFS-12794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
> Attachments: HDFS-12794-HDFS-7240.001.patch, 
> HDFS-12794-HDFS-7240.002.patch, HDFS-12794-HDFS-7240.003.patch, 
> HDFS-12794-HDFS-7240.004.patch, HDFS-12794-HDFS-7240.005.patch, 
> HDFS-12794-HDFS-7240.006.patch, HDFS-12794-HDFS-7240.007.patch
>
>
> ChunkOutputStream writes are synchronous in nature. Once one chunk of data gets 
> written, the next chunk write is blocked until the previous chunk has been written 
> to the container.
> ChunkOutputStream writes should be made asynchronous, and close() on the 
> OutputStream should ensure that all dirty buffers are flushed to the container.






[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2018-01-05 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314013#comment-16314013
 ] 

Xiaoyu Yao edited comment on HDFS-12794 at 1/5/18 10:21 PM:


Thanks [~shashikant] for the patch and all for the reviews/discussion. The 
latest patch looks good to me overall. I just have a few minor issues.

*ChunkGroupOutputStream.java*
Suggest renaming blockSize to streamBufferSize here and using a separate 
configuration key to allow customization.
We can still set it to the SCM block size in DistributedStorageHandler by default.
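
For illustration only, that could look roughly like the following (the key name, 
default handling, and variable names are assumptions, not the actual configuration 
introduced by the patch):
{code}
// Hypothetical configuration key for the client stream buffer size.
public static final String OZONE_CLIENT_STREAM_BUFFER_SIZE =
    "ozone.client.stream.buffer.size";

// In DistributedStorageHandler (sketch): default to the SCM block size.
long streamBufferSize =
    conf.getLong(OZONE_CLIENT_STREAM_BUFFER_SIZE, scmBlockSize);
{code}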

*ChunkOutputStream.java*
Line 74: NIT: chunkIndex can be removed as we are replacing it with a timestamp.

Line 87: should we rename the parameter to avoid confusion? It seems to be more 
like a maxBufferSize. (The same applies to the change in 
ChunkGroupOutputStream.java that plumbs the blockSize parameter.)

Line 139-166: in the new code, the rollback will position the buffer to the 
very beginning of all async operations even though some of them may succeed, 
while the previous sequential write allows partial rollback. Any thoughts on 
the tradeoff here? The parallel stream write is beneficial only for certain use 
cases (e.g., large files)?

Line 196: NIT: Need to clarify the comments

Line 337: should we put offset here or remove the offset parameter assuming 
this always starts with ByteBuffer offset 0.

*ContainerProtocolCalls.java*
Line 56: NIT: unused imports


was (Author: xyao):
Thanks [~shashikant] for the patch and all for the reviews/discussion. The 
latest patch looks good to me overall. I just have a few minor issues.

ChunkGroupOutputStream.java
Suggest rename the blockSize to streamBufferSize here and use a separate 
configuration key to allow customization. 
We still can set it the same as SCM block size by DistributedStorageHandler by 
default. 

ChunkOutputStream.java
Line 74: NIT: chukIndex can be removed as we are replacing it with a timestamp.

Line 87: should we rename the parameter to avoid confusion. It seems to be more 
like a maxBufferSize. (Same applies to the change in 
ChunkGroupOutputStream.java that plumbs the blockSize parameter.

Line 139-166: in the new code, the rollback will position the buffer to the 
very beginning of all async operations even though some of them may succeed 
while the previous sequential write allows partial rollback. Any thoughts on 
the tradeoff here? The parallel steam write is beneficial only for  certain use 
cases (e.g., large files)? 

Line 196: NIT: Need to clarify the comments

Line 337: should we put offset here or remove the offset parameter assuming 
this always starts with ByteBuffer offset 0.

ContainerProtocolCalls.java
Line 56: NIT: unused imports

> Ozone: Parallelize ChunkOutputSream Writes to container
> ---
>
> Key: HDFS-12794
> URL: https://issues.apache.org/jira/browse/HDFS-12794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
> Attachments: HDFS-12794-HDFS-7240.001.patch, 
> HDFS-12794-HDFS-7240.002.patch, HDFS-12794-HDFS-7240.003.patch, 
> HDFS-12794-HDFS-7240.004.patch, HDFS-12794-HDFS-7240.005.patch, 
> HDFS-12794-HDFS-7240.006.patch
>
>
> ChunkOutputStream writes are synchronous in nature. Once one chunk of data gets 
> written, the next chunk write is blocked until the previous chunk has been written 
> to the container.
> ChunkOutputStream writes should be made asynchronous, and close() on the 
> OutputStream should ensure that all dirty buffers are flushed to the container.






[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2017-11-21 Thread Mukul Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260437#comment-16260437
 ] 

Mukul Kumar Singh edited comment on HDFS-12794 at 11/21/17 8:50 AM:


Thanks for the updated patch, [~shashikant]. Some minor comments on the latest 
v3 patch.

1) ChunkGroupOutputStream: 297, please align the comments.
2) ChunkGroupOutputStream: 423, I feel we should not create a new semaphore 
there; is there a case where the semaphore will be passed as null?
3) ContainerProtocolCalls.java:188-220, should the writeChunk API use 
writeChunkAsync and then do a get() on the future? (See the sketch after this list.)
4) Please fix the checkstyle errors.
5) Unit test failure of 
org.apache.hadoop.ozone.TestOzoneConfigurationFields.testCompareConfigurationClassAgainstXml
 is related. Please have a look.
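
Regarding point 3, a minimal sketch of what that could look like (the exact 
parameter types and signatures in ContainerProtocolCalls are assumptions here):
{code}
// Sketch: keep a single write path by implementing the blocking writeChunk
// in terms of writeChunkAsync plus a get() on the returned future.
public static void writeChunk(XceiverClientSpi xceiverClient, ChunkInfo chunk,
    String key, ByteString data, String traceID) throws IOException {
  try {
    writeChunkAsync(xceiverClient, chunk, key, data, traceID).get();
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new IOException("writeChunk interrupted", e);
  } catch (ExecutionException e) {
    throw new IOException("writeChunk failed", e);
  }
}
{code}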


was (Author: msingh):
Thanks for the updated patch [~shashikant], 

1) ChunkGroupOutputStream : 297 please align the comments
2) ChunkGroupOutputStream: 423, I feel we should not create a new semaphore 
there, is there a case where the semaphore will be passed as null?
3) ContainerProtocolCalls.java:188-220, should the writeChunk Api use the the 
writeChunkAsync and then do a future on get ?
4) Please fix the checkstyle errors.
5) Unit test failure of 
org.apache.hadoop.ozone.TestOzoneConfigurationFields.testCompareConfigurationClassAgainstXml
 is related. Please have a look.

> Ozone: Parallelize ChunkOutputSream Writes to container
> ---
>
> Key: HDFS-12794
> URL: https://issues.apache.org/jira/browse/HDFS-12794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
> Attachments: HDFS-12794-HDFS-7240.001.patch, 
> HDFS-12794-HDFS-7240.002.patch, HDFS-12794-HDFS-7240.003.patch
>
>
> ChunkOutputStream writes are synchronous in nature. Once one chunk of data gets 
> written, the next chunk write is blocked until the previous chunk has been written 
> to the container.
> ChunkOutputStream writes should be made asynchronous, and close() on the 
> OutputStream should ensure that all dirty buffers are flushed to the container.






[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2017-11-20 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259728#comment-16259728
 ] 

Shashikant Banerjee edited comment on HDFS-12794 at 11/20/17 7:58 PM:
--

Thanks [~anu] for the review comments.
As per discussion with [~anu], here are a few conclusions:
1)
{code}
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}
{code}

While doing close on the groupOutputStream, we call chunkOutputStream.close, 
where we do a future.get() on the response obtained after the write completes on 
the xceiver server, which makes sure the response has been received from the 
xceiver server. While closing the groupStream, the number of available semaphore 
permits should equal the maximum number of outstanding chunks allowed at any 
given point in time.
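
A simplified sketch of that close path (names approximate, not the patch 
verbatim):
{code}
// ChunkOutputStream.close() (sketch): wait for the async write response from
// the xceiver server, then return the permit so the group-level check holds.
public void close() throws IOException {
  try {
    if (response != null) {
      response.get();   // blocks until the container acknowledges the chunk write
    }
  } catch (InterruptedException | ExecutionException e) {
    throw new IOException("Unexpected Storage Container Exception: " + e, e);
  } finally {
    semaphore.release();
  }
}

// ChunkGroupOutputStream.close() (sketch): once every entry is closed, all
// permits are back, i.e. no outstanding chunk writes remain.
Preconditions.checkArgument(
    semaphore.availablePermits() == getMaxOutstandingChunks());
{code}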


2.  {code}
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}
{code}
Hardcoding an exception when the writeChunkToContainer call completes on the 
xceiverServer shows that the exception is caught in the 
ChunkGroupOutputStream.close path, which is expected.

{code}
  response = response.thenApply(reply -> {
try{
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
}catch (IOException e){
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e)
{code}
{code}
java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
{code}

This is as expected. The idea was to write a mock test for the 
validateContainerResponse call, which is a static method of a final class; this 
requires PowerMockRunner, which leads to issues while bringing up the 
MiniOzoneCluster. Will address a unit test to verify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.



was (Author: shashikant):
Thanks [~anu] , for the review comments.
As per discussion with [~anu], here are few conclusions:
1)
{code}
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}
{code}

While doing close on the groupOutputStream, we do chunkOutputstream.close, 
where we do future.get() on response obtained after the write completes from 
the xceiver server which makes sure the response is received from the xceiver 
server. While closing the groupStream, semaphorePermiCount should be equal to 
no of available permits which is equal to max no of outstanding chunks at any 
given point of time.


2.  {code}
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}
{code}
Hardcoding the exception when the writeChunkToConatiner calls completes in the 
xceiverServer shows that , the exception is caught in the 
chunkoutputGroupStream.close path which is expected.

{code}
  response = response.thenApply(reply -> {
try{
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
}catch (IOException e){
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e)
{code}
java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
{code}

This is as expected.  Idea was to write a mocktest while 
validatingContainerResposne calls which is static method of a final class, and 
this requires powerMockrunner which leads to issues while bringing up the 
miniOzoneCluster.Will address the unit test to vertify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.


> Ozone: Parallelize 

[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2017-11-20 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259728#comment-16259728
 ] 

Shashikant Banerjee edited comment on HDFS-12794 at 11/20/17 7:57 PM:
--

Thanks [~anu] for the review comments.
As per discussion with [~anu], here are a few conclusions:
1)
{code}
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}
{code}

While doing close on the groupOutputStream, we call chunkOutputStream.close, 
where we do a future.get() on the response obtained after the write completes on 
the xceiver server, which makes sure the response has been received from the 
xceiver server. While closing the groupStream, the number of available semaphore 
permits should equal the maximum number of outstanding chunks allowed at any 
given point in time.


2.  {code}
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}
{code}
Hardcoding an exception when the writeChunkToContainer call completes on the 
xceiverServer shows that the exception is caught in the 
ChunkGroupOutputStream.close path, which is expected.

{code}
  response = response.thenApply(reply -> {
try{
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
}catch (IOException e){
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e)
{code}
java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
{code}

This is as expected. The idea was to write a mock test for the 
validateContainerResponse call, which is a static method of a final class; this 
requires PowerMockRunner, which leads to issues while bringing up the 
MiniOzoneCluster. Will address a unit test to verify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.



was (Author: shashikant):
Thanks [~anu] , for the review comments.
As per discussion with [~anu], here are few conclusions:
1)
code {}
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}
code {}

While doing close on the groupOutputStream, we do chunkOutputstream.close, 
where we do future.get() on response obtained after the write completes from 
the xceiver server which makes sure the response is received from the xceiver 
server. While closing the groupStream, semaphorePermiCount should be equal to 
no of available permits which is equal to max no of outstanding chunks at any 
given point of time.


2.  code {}
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}
code {}
Hardcoding the exception when the writeChunkToConatiner calls completes in the 
xceiverServer shows that , the exception is caught in the 
chunkoutputGroupStream.close path which is expected.

code{}
  response = response.thenApply(reply -> {
try{
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
}catch (IOException e){
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e)
code{}
java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
code{}

This is as expected.  Idea was to write a mocktest while 
validatingContainerResposne calls which is static method of a final class, and 
this requires powerMockrunner which leads to issues while bringing up the 
miniOzoneCluster.Will address the unit test to vertify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.


> Ozone: Parallelize ChunkOutputSream 

[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2017-11-20 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259728#comment-16259728
 ] 

Shashikant Banerjee edited comment on HDFS-12794 at 11/20/17 7:50 PM:
--

Thanks [~anu] for the review comments.
As per discussion with [~anu], here are a few conclusions:
1)
code {}
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}
code {}

While doing close on the groupOutputStream, we call chunkOutputStream.close, 
where we do a future.get() on the response obtained after the write completes on 
the xceiver server, which makes sure the response has been received from the 
xceiver server. While closing the groupStream, the number of available semaphore 
permits should equal the maximum number of outstanding chunks allowed at any 
given point in time.


2.  code {}
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}
code {}
Hardcoding an exception when the writeChunkToContainer call completes on the 
xceiverServer shows that the exception is caught in the 
ChunkGroupOutputStream.close path, which is expected.

code{}
  response = response.thenApply(reply -> {
try{
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
}catch (IOException e){
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e)
code{}
java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
code{}

This is as expected. The idea was to write a mock test for the 
validateContainerResponse call, which is a static method of a final class; this 
requires PowerMockRunner, which leads to issues while bringing up the 
MiniOzoneCluster. Will address a unit test to verify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.



was (Author: shashikant):
Thanks [~anu] , for the review comments.
As per discussion with [~anu], here are few conclusions:
1)
code {
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}

While doing close on the groupOutputStream, we do chunkOutputstream.close, 
where we do future.get() on response obtained after the write completes from 
the xceiver server which makes sure the response is received from the xceiver 
server. While closing the groupStream, semaphorePermiCount should be equal to 
no of available permits which is equal to max no of outstanding chunks at any 
given point of time.


2.  code {
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}

Hardcoding the exception when the writeChunkToConatiner calls completes in the 
xceiverServer shows that , the exception is caught in the 
chunkoutputGroupStream.close path which is expected.

Code {
  response = response.thenApply(reply -> {
try{
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
}catch (IOException e){
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e)

java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
}

This is as expected.  Idea was to write a mocktest while 
validatingContainerResposne calls which is static method of a final class, and 
this requires powerMockrunner which leads to issues while bringing up the 
miniOzoneCluster.Will address the unit test to vertify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.


> Ozone: Parallelize ChunkOutputSream Writes to container
> 

[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2017-11-20 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259728#comment-16259728
 ] 

Shashikant Banerjee edited comment on HDFS-12794 at 11/20/17 7:45 PM:
--

Thanks [~anu] for the review comments.
As per discussion with [~anu], here are a few conclusions:
1)
code {
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}

While doing close on the groupOutputStream, we call chunkOutputStream.close, 
where we do a future.get() on the response obtained after the write completes on 
the xceiver server, which makes sure the response has been received from the 
xceiver server. While closing the groupStream, the number of available semaphore 
permits should equal the maximum number of outstanding chunks allowed at any 
given point in time.


2.  code {
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}

Hardcoding an exception when the writeChunkToContainer call completes on the 
xceiverServer shows that the exception is caught in the 
ChunkGroupOutputStream.close path, which is expected.

Code {
  response = response.thenApply(reply -> {
try{
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
}catch (IOException e){
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e)

java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
}

This is as expected. The idea was to write a mock test for the 
validateContainerResponse call, which is a static method of a final class; this 
requires PowerMockRunner, which leads to issues while bringing up the 
MiniOzoneCluster. Will address a unit test to verify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.



was (Author: shashikant):
Thanks [~anu] , for the review comments.
As per discussion with [~anu], here are few conclusions:
1)
code {
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}

While doing close on the groupOutputStream, we do chunkOutputstream.close, 
where we do future.get() on response obtained after the write completes from 
the xceiver server which makes sure the response is received from the xceiver 
server. While closing the groupStream, semaphorePermiCount should be equal to 
no of available permits which is equal to max no of outstanding chunks at any 
given point of time.


2.  code {
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}

Hardcoding the exception when the writeChunkToConatiner calls completes in the 
xceiverServer shows that , the exception is caught in the 
chunkoutputGroupStream.close path which is expected.

Code {
  response = response.thenApply(reply -> {
try {
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
} catch (IOException e) {
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
} 


java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
}

This is as expected.  Idea was to write a mocktest while 
validatingContainerResposne calls which is static method of a final class, and 
this requires powerMockrunner which leads to issues while bringing up the 
miniOzoneCluster.Will address the unit test to vertify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.


> Ozone: Parallelize ChunkOutputSream Writes to container
> 

[jira] [Comment Edited] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

2017-11-20 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259728#comment-16259728
 ] 

Shashikant Banerjee edited comment on HDFS-12794 at 11/20/17 7:43 PM:
--

Thanks [~anu] for the review comments.
As per discussion with [~anu], here are a few conclusions:
1)
code {
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}

While doing close on the groupOutputStream, we call chunkOutputStream.close, 
where we do a future.get() on the response obtained after the write completes on 
the xceiver server, which makes sure the response has been received from the 
xceiver server. While closing the groupStream, the number of available semaphore 
permits should equal the maximum number of outstanding chunks allowed at any 
given point in time.


2.  code {
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}

Hardcoding an exception when the writeChunkToContainer call completes on the 
xceiverServer shows that the exception is caught in the 
ChunkGroupOutputStream.close path, which is expected.

Code {
  response = response.thenApply(reply -> {
try {
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
} catch (IOException e) {
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
} 


java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
}

This is as expected. The idea was to write a mock test for the 
validateContainerResponse call, which is a static method of a final class; this 
requires PowerMockRunner, which leads to issues while bringing up the 
MiniOzoneCluster. Will address a unit test to verify the same later in a 
different jira.

Patch v3 addresses the remaining review comments.
[~anu]/others, please have a look.



was (Author: shashikant):
Thanks [~anu] , for the review comments.
As per discussion with [~anu], here are few conclusions:
1)
code {
//make sure all the data in the ChunkoutputStreams is written to the
  //  container
  Preconditions.checkArgument(
  semaphore.availablePermits() == getMaxOutstandingChunks());
}

# While doing close on the groupOutputStream, we do chunkOutputstream.close, 
where we do future.get() on response obtained after the write completes from 
the xceiver server which makes sure the response is received from the xceiver 
server. While closing the groupStream, semaphorePermiCount should be equal to 
no of available permits which is equal to max no of outstanding chunks at any 
given pint of time.


2.  code {
throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
}

Hardcoding the exception when the writeChunkToConatiner calls completes in the 
sceiverServer shows that , the exception is caught in the 
chunkoutputGroupStream.close path which is expected.

Code {
 try {
  String requestID =
  traceID + chunkIndex + ContainerProtos.Type.WriteChunk.name();
  //add the chunk write traceId to the queue
  semaphore.acquire();
  LOG.warn("calling async");
  response =
  writeChunkAsync(xceiverClient, chunk, key, data, requestID);
  response = response.thenApply(reply -> {
try {
  throw new IOException("Exception while validating response");
 // ContainerProtocolCalls.validateContainerResponse(reply);
 // return reply;
} catch (IOException e) {
  LOG.info("coming here to throw exception");
  throw new CompletionException(
  "Unexpected Storage Container Exception: " + e.toString(), e);
} 
}

code {
java.io.IOException: Unexpected Storage Container Exception: 
java.util.concurrent.ExecutionException: java.io.IOException: Exception while 
validating response

  at 
org.apache.hadoop.scm.storage.ChunkOutputStream.close(ChunkOutputStream.java:174)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream$ChunkOutputStreamEntry.close(ChunkGroupOutputStream.java:468)
  at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.close(ChunkGroupOutputStream.java:291)
}

This is as expected.  Idea was to write a mocktest while 
validatingContainerResposne