[
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151729#comment-15151729
]
Mingliang Liu commented on HDFS-7661:
-------------------------------------
Thanks [~drankye] and [~sinall] for your prompt comments.
As I thought about the v2 design of this feature, I finally realized this is
far from a simple patch. I would be glad to join the collaborative effort to
speed it up, as this is of high priority among the EC sub-tasks (as
discussed in [HDFS-9603]).
I agree with [~drankye] that we need to settle the design and approach first.
Once this is well discussed, we can split the work into small, separate
components. For write, it includes:
* client side hflush (mainly in {{DFSStripedOutputStream#flushOrSync}})
* DN receiving the packets (mainly in {{BlockReceiver#receivePacket}})
* DN appending and committing the BG length to the meta file (and parsing it for read)
* Fsdataset operations in DN to support file overwrite (as commented by
[~sinall])
Meanwhile, for read requests:
* parity DN calculating its safe length
* the client side computing the maximum visible BG length
* and the protocol in between, e.g. the DN may need to be aware of the EC
policy to calculate its safe length.
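To make the client-side step concrete, here is a minimal sketch of one possible way to derive the maximum visible BG length. It is not taken from the design document or from any patch. It assumes each DN reports a single "safe length", meaning the largest BG length for which it holds complete data, and that a BG length is readable once at least as many DNs as there are data units cover it. The class name {{VisibleLengthSketch}} and the method {{maxVisibleBgLength}} are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch only; not part of HDFS or of any HDFS-7661 patch. */
class VisibleLengthSketch {

  /**
   * Returns the largest block-group length L such that at least
   * numDataUnits of the reported safe lengths are >= L.
   * Returns 0 if fewer than numDataUnits DNs reported at all.
   *
   * Assumption: safeLengths[i] is the largest BG length for which
   * DN i holds complete data.
   */
  static long maxVisibleBgLength(long[] safeLengths, int numDataUnits) {
    if (numDataUnits <= 0) {
      throw new IllegalArgumentException("numDataUnits must be positive");
    }
    if (safeLengths.length < numDataUnits) {
      return 0L; // not enough reports to decode anything
    }
    long[] sorted = safeLengths.clone();
    Arrays.sort(sorted); // ascending order
    // The numDataUnits-th largest report is covered by itself and by
    // every larger report, i.e. by at least numDataUnits DNs.
    return sorted[sorted.length - numDataUnits];
  }

  public static void main(String[] args) {
    // Illustrative numbers: 9 internal blocks, 6 data units.
    long[] reported = {300, 300, 300, 200, 300, 200, 300, 200, 300};
    System.out.println(maxVisibleBgLength(reported, 6)); // prints 300
  }
}
```

In this example six of the nine DNs report 300, so 300 is visible. If only five did, the result would fall back to 200. Whether the threshold should be the number of data units, and how a parity DN derives its own safe length, is exactly what the design still needs to settle.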
Moreover, block reconstruction also needs to support {{hflush}}-ed files, which
is not yet covered by the current design.
Last, we need to test the code thoroughly. Ideally, we can test each code
segment independently, without involving too much context from the other
parts. End-to-end tests are certainly needed once we bring them together.
I believe I must have missed something. As we discuss the design, my
in-progress demo patch mainly focuses on the client
{{DFSStripedOutputStream#flushOrSync}} and DN {{BlockReceiver#receivePacket}}.
Thus [~sinall]'s code for overwrite support in fsdataset should be reused.
For read requests, I don't have any code yet.
> Erasure coding: support hflush and hsync
> ----------------------------------------
>
> Key: HDFS-7661
> URL: https://issues.apache.org/jira/browse/HDFS-7661
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Tsz Wo Nicholas Sze
> Assignee: GAO Rui
> Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png,
> HDFS-7661-unitTest-wip-trunk.patch,
> HDFS-EC-file-flush-sync-design-version1.1.pdf,
> HDFS-EC-file-flush-sync-design-version2.0.pdf
>
>
> We also need to support hflush/hsync and visible length.