[ 
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151729#comment-15151729
 ] 

Mingliang Liu commented on HDFS-7661:
-------------------------------------

Thanks [~drankye] and [~sinall] for your prompt comments.

As I thought about the v2 design of this feature, I realized that this is 
far from a simple patch. I would be glad to join the collaborative effort to 
speed it up, as this is of high priority among the EC sub-tasks (as 
discussed in [HDFS-9603]).

I agree with [~drankye] that we need to settle on the design and approach 
first. Once that is well discussed, we can split the work into small 
components. For writes, these include:
* client side hflush (mainly in {{DFSStripedOutputStream#flushOrSync}})
* DN receiving the packets (mainly in {{BlockReceiver#receivePacket}})
* DN appending and committing the BG length to the meta file (and parsing it for reads)
* Fsdataset operations in DN to support file overwrite (as commented by 
[~sinall])

Meanwhile, for read requests:
* parity DN calculating its safe length
* the client side computing the maximum visible BG length
* the protocol in between, e.g. the DN may need to be aware of the EC policy 
to calculate its safe length.
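
To make the read-side items a bit more concrete, below is a minimal sketch of 
one way a safe length could be derived from the internal block lengths that 
the DNs report. It is only an illustration under assumptions, not what the 
design doc specifies: the class {{SafeLengthSketch}} and the method 
{{safeLength}} are hypothetical names, and the rule itself (count only full 
stripes that at least {{numDataUnits}} internal blocks have completely 
received) is an assumption.

```java
import java.util.Arrays;

/** Hypothetical sketch; the names and the rule are assumptions, not the v2 design. */
public class SafeLengthSketch {

  /**
   * Returns the largest block group (BG) length made up only of full stripes
   * that at least numDataUnits internal blocks have completely received.
   */
  public static long safeLength(long[] internalBlockLens, int numDataUnits, int cellSize) {
    if (internalBlockLens.length < numDataUnits) {
      return 0L; // too few internal blocks reported, nothing is decodable
    }
    long[] sorted = internalBlockLens.clone();
    Arrays.sort(sorted); // ascending order
    // The numDataUnits-th largest length bounds what can still be decoded.
    long bound = sorted[sorted.length - numDataUnits];
    long fullStripes = bound / cellSize;
    // Each full stripe carries numDataUnits cells of user data.
    return fullStripes * (long) cellSize * numDataUnits;
  }

  public static void main(String[] args) {
    // RS(6,3) layout with a tiny 4-byte cell to keep the numbers readable.
    long[] lens = {12, 12, 12, 8, 8, 8, 12, 4, 0};
    System.out.println(safeLength(lens, 6, 4)); // prints 48
  }
}
```

In this example seven internal blocks hold at least 8 bytes, so two full 
stripes (2 * 4 * 6 = 48 bytes) are counted. How a partial last stripe should 
be treated is left open here.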

Moreover, block reconstruction also needs to support {{hflush}}-ed files, which 
is not yet covered by the current design.

Last, we need to test the code thoroughly. Ideally, we can test each code 
segment independently, without pulling in too much context from other 
parts. End-to-end tests are certainly needed once we bring them together.

I have probably missed something. While we discuss the design, my 
in-progress demo patch focuses mainly on the client 
{{DFSStripedOutputStream#flushOrSync}} and the DN {{BlockReceiver#receivePacket}}. 
[~sinall]'s code for overwrite support in fsdataset should therefore be reused. 
For read requests, I don't have any code yet.

> Erasure coding: support hflush and hsync
> ----------------------------------------
>
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, 
> HDFS-7661-unitTest-wip-trunk.patch, 
> HDFS-EC-file-flush-sync-design-version1.1.pdf, 
> HDFS-EC-file-flush-sync-design-version2.0.pdf
>
>
> We also need to support hflush/hsync and visible length. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
