[jira] [Updated] (HDFS-12794) Ozone: Parallelize ChunkOutputSream Writes to container

Shashikant Banerjee (JIRA) Mon, 08 Jan 2018 03:11:26 -0800

     [ 
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shashikant Banerjee updated HDFS-12794:
---------------------------------------
    Attachment: HDFS-12794-HDFS-7240.007.patch

Thanks [~xyao], for the review comments.

{code}
Line 139-166: in the new code, the rollback will position the buffer to the 
very beginning of all async operations even though some of them may succeed 
while the previous sequential write allows partial rollback. Any thoughts on 
the tradeoff here? The parallel stream write is beneficial only for certain use 
cases (e.g., large files)?
{code}

I agree that the previous code had the flexibility of partial rollback which 
the current patch doesn't have.
{code}
public void write(byte b[], int off, int len) throws IOException {
{code}

If a write on a Java output stream fails, the client's write request fails, and 
the client can retry the write starting from the original offset. So if an 
exception occurs during the write on the stream, rolling back to the initial 
offset should work.

Given that the write API returns void, we never tell the client the exact point 
at which the failure happened. Partial rollback would help only if the code 
retried the chunk write internally on certain failures, which is not the case 
here. In such situations, rolling back to the initial offset should be the 
usual action. What do you think?
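
To make the rollback behaviour concrete, here is a minimal sketch of the idea, 
not the code in the patch. The class RollbackSketch, the method writeAll, and 
the chunkWriter callback (a stand-in for the async call to the container) are 
hypothetical names used only for illustration. All chunk writes are issued 
asynchronously; if any of them fails, the buffer is repositioned to where the 
call started, so a retry by the caller resends everything from that offset.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch: names and structure are assumptions, not the patch itself.
class RollbackSketch {

  // Splits the remaining bytes of `buffer` into chunks of at most `chunkSize`
  // bytes and hands each chunk to `chunkWriter`, which returns a future for
  // the async write. On any failure the buffer position is reset to the
  // position it had on entry (full rollback, no partial rollback).
  static void writeAll(ByteBuffer buffer, int chunkSize,
      Function<ByteBuffer, CompletableFuture<Void>> chunkWriter)
      throws IOException {
    final int rollbackPosition = buffer.position();
    List<CompletableFuture<Void>> pending = new ArrayList<>();
    try {
      while (buffer.hasRemaining()) {
        int len = Math.min(chunkSize, buffer.remaining());
        // Create an independent view covering only this chunk.
        ByteBuffer chunk = buffer.duplicate();
        chunk.limit(chunk.position() + len);
        pending.add(chunkWriter.apply(chunk.slice()));
        buffer.position(buffer.position() + len);
      }
      // allOf completes only after every write has finished; join() throws
      // a CompletionException if at least one of them failed.
      CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
    } catch (RuntimeException e) {
      // The caller is not told which chunk failed, so go back to the start.
      buffer.position(rollbackPosition);
      throw new IOException(
          "chunk write failed, rolled back to offset " + rollbackPosition, e);
    }
  }
}
```

Because the caller only sees an IOException, it cannot tell which chunks 
reached the container, so resetting to the entry position is the only offset 
it can safely retry from.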

{code}
Line 337: should we put offset here or remove the offset parameter assuming 
this always starts with ByteBuffer offset 0.
{code}

ChunkInfo contains the offset field, which is used in a lot of places. I think 
we can keep it here and make the change in a different JIRA, if it is required 
at all.

Patch v7 addresses the remaining review comments. Please have a look.


> Ozone: Parallelize ChunkOutputSream Writes to container
> -------------------------------------------------------
>
>                 Key: HDFS-12794
>                 URL: https://issues.apache.org/jira/browse/HDFS-12794
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>             Fix For: HDFS-7240
>
>         Attachments: HDFS-12794-HDFS-7240.001.patch, 
> HDFS-12794-HDFS-7240.002.patch, HDFS-12794-HDFS-7240.003.patch, 
> HDFS-12794-HDFS-7240.004.patch, HDFS-12794-HDFS-7240.005.patch, 
> HDFS-12794-HDFS-7240.006.patch, HDFS-12794-HDFS-7240.007.patch
>
>
> The ChunkOutputStream writes are synchronous in nature. While one chunk of 
> data is being written, the next chunk write is blocked until the previous 
> chunk has been written to the container.
> The ChunkOutputStream writes should be made async, and close on the 
> OutputStream should ensure that all dirty buffers are flushed to the container.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

