[
https://issues.apache.org/jira/browse/HDFS-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shashikant Banerjee updated HDFS-12794:
---------------------------------------
Attachment: HDFS-12794-HDFS-7240.007.patch
Thanks [~xyao] for the review comments.
{code}
Line 139-166: in the new code, the rollback will position the buffer to the
very beginning of all async operations even though some of them may succeed
while the previous sequential write allows partial rollback. Any thoughts on
the tradeoff here? The parallel stream write is beneficial only for certain use
cases (e.g., large files)?
{code}
I agree that the previous code had the flexibility of partial rollback which
the current patch doesn't have.
{code}
public void write(byte b[], int off, int len) throws IOException {
{code}
If a write on a Java output stream fails, the client's write request fails,
and the client can then retry the write op starting from the original offset.
So if an exception occurs during the write on the stream, rolling back to the
initial offset should work.
Given that the write API returns void, we never tell the client at what exact
point the failure happened. Partial rollback would help only if the code
internally retried writing the chunk after certain failures, which is not the
case here. In such situations, rollback to the initial offset should be the
usual action. What do you think?
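To make the rollback-to-initial-offset argument concrete, here is a minimal
sketch. It is not the code from the patch: the class RollbackSketch, its
chunkWriter callback (a stand-in for the container write) and the position()
accessor are all hypothetical names made up for illustration. It assumes the
chunks are dispatched with CompletableFuture and that any failure resets the
buffer to where this write() call began.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Consumer;

// Hypothetical sketch only; not the ChunkOutputStream from the patch.
class RollbackSketch {
  private final ByteBuffer buffer;
  private final int chunkSize;
  // Assumed stand-in for the container write; throws to simulate a failure.
  private final Consumer<byte[]> chunkWriter;

  RollbackSketch(int capacity, int chunkSize, Consumer<byte[]> chunkWriter) {
    this.buffer = ByteBuffer.allocate(capacity);
    this.chunkSize = chunkSize;
    this.chunkWriter = chunkWriter;
  }

  int position() {
    return buffer.position();
  }

  // Void API: the caller learns only success or failure, never a partial count.
  void write(byte[] b, int off, int len) throws IOException {
    int initialPosition = buffer.position(); // single rollback point
    List<CompletableFuture<Void>> pending = new ArrayList<>();
    int written = 0;
    while (written < len) {
      int n = Math.min(chunkSize, len - written);
      buffer.put(b, off + written, n);
      byte[] chunk = Arrays.copyOfRange(b, off + written, off + written + n);
      // Each chunk is written asynchronously on the common pool.
      pending.add(CompletableFuture.runAsync(() -> chunkWriter.accept(chunk)));
      written += n;
    }
    try {
      // allOf completes only after every chunk write has finished.
      CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
    } catch (CompletionException e) {
      // Some chunks may have succeeded, but we still rewind to the start,
      // so the caller can simply retry from the original offset.
      buffer.position(initialPosition);
      throw new IOException("chunk write failed, rolled back", e.getCause());
    }
  }
}
```

In this sketch a failed write() leaves position() exactly where it was before
the call, regardless of which chunk failed, which matches the retry contract
described above.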
{code}
Line 337: should we put offset here or remove the offset parameter assuming
this always starts with ByteBuffer offset 0.
{code}
ChunkInfo contains the field offset, which is used in a lot of places. I think
we can keep it here and make the change in a different jira, if it is required
at all.
Patch v7 addresses the remaining review comments. Please have a look.
> Ozone: Parallelize ChunkOutputSream Writes to container
> -------------------------------------------------------
>
> Key: HDFS-12794
> URL: https://issues.apache.org/jira/browse/HDFS-12794
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Shashikant Banerjee
> Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
> Attachments: HDFS-12794-HDFS-7240.001.patch,
> HDFS-12794-HDFS-7240.002.patch, HDFS-12794-HDFS-7240.003.patch,
> HDFS-12794-HDFS-7240.004.patch, HDFS-12794-HDFS-7240.005.patch,
> HDFS-12794-HDFS-7240.006.patch, HDFS-12794-HDFS-7240.007.patch
>
>
> The ChunkOutputStream writes are synchronous in nature. Once one chunk of
> data gets written, the next chunk write is blocked until the previous chunk
> is written to the container.
> The ChunkOutputStream writes should be made async, and close on the
> OutputStream should ensure that all dirty buffers are flushed to the
> container.
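As a rough illustration of the behaviour the description asks for (async chunk
writes, with close() flushing whatever is still buffered and waiting for all
in-flight writes), a sketch could look like the following. The class
AsyncChunkStreamSketch and its chunkWriter callback are hypothetical and only
stand in for the real container client; this is not the patch itself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Consumer;

// Hypothetical sketch only; not the ChunkOutputStream from the patch.
class AsyncChunkStreamSketch extends OutputStream {
  private final byte[] buffer;
  private int count;
  // Assumed stand-in for the container write call.
  private final Consumer<byte[]> chunkWriter;
  private final List<CompletableFuture<Void>> pending = new ArrayList<>();

  AsyncChunkStreamSketch(int chunkSize, Consumer<byte[]> chunkWriter) {
    this.buffer = new byte[chunkSize];
    this.chunkWriter = chunkWriter;
  }

  @Override
  public void write(int b) {
    buffer[count++] = (byte) b;
    if (count == buffer.length) {
      submitChunk(); // full chunk: hand it off without blocking the caller
    }
  }

  private void submitChunk() {
    if (count == 0) {
      return;
    }
    byte[] chunk = Arrays.copyOf(buffer, count);
    count = 0;
    pending.add(CompletableFuture.runAsync(() -> chunkWriter.accept(chunk)));
  }

  @Override
  public void close() throws IOException {
    submitChunk(); // flush the last, possibly partial, dirty buffer
    try {
      // Block until every outstanding chunk write has completed.
      CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
    } catch (CompletionException e) {
      throw new IOException("async chunk write failed", e.getCause());
    }
  }
}
```

Here write() returns as soon as a full chunk is handed off, and only close()
waits, so a failure in any chunk surfaces at close() as an IOException.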