[ https://issues.apache.org/jira/browse/HDFS-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319692#comment-14319692 ]
Zhe Zhang commented on HDFS-7729:
---------------------------------

[~libo-intel] Thank you for taking on the refactor task! I think it's a great opportunity to "preview" the quality requirements we will face when we eventually merge the branch to trunk.

Just to clarify, I think the current patch (008) already has the [3 main steps | https://issues.apache.org/jira/browse/HDFS-7729?focusedCommentId=14319136&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14319136] required in striped writing. After the HDFS-7793 refactor, I think we'll be able to reuse a major portion of it.

In subclassing {{DFSOutputStream}} we should try to minimize duplicate code. For example, we should ideally add the striping logic around the parent implementation in overridden methods:
{code}
// DFSOutputStreamStriped
@Override
protected synchronized void writeChunk(byte[] b, int offset, int len,
    byte[] checksum, int ckoff, int cklen) throws IOException {
  // Buffer the chunk in the current striping cell, then reuse the parent logic.
  addToCellBuffer(b, offset, len);
  super.writeChunk(b, offset, len, checksum, ckoff, cklen);
  // Two extra steps are needed when a striping cell is full:
  // 1. Advance the current index pointer
  // 2. Generate parity packets if a full stripe of data cells is present
  ...
}
{code}

Again, great work so far by [~libo-intel] and reviewers. Let's take the time and build the right client code for EC.

> Add logic to DFSOutputStream to support writing a file in striping layout
> --------------------------------------------------------------------------
>
>                 Key: HDFS-7729
>                 URL: https://issues.apache.org/jira/browse/HDFS-7729
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: Codec-tmp.patch, HDFS-7729-001.patch,
> HDFS-7729-002.patch, HDFS-7729-003.patch, HDFS-7729-004.patch,
> HDFS-7729-005.patch, HDFS-7729-006.patch, HDFS-7729-007.patch,
> HDFS-7729-008.patch
>
>
> If a client wants to directly write a file in striping layout, we need to add some
> logic to DFSOutputStream. DFSOutputStream needs multiple DataStreamers to
> write each cell of a stripe to a remote datanode.
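
The override pattern suggested in the comment can be illustrated with a small self-contained sketch. This is not the HDFS code: {{SimpleOutputStream}} and {{StripedOutputStream}} are hypothetical stand-ins for {{DFSOutputStream}} and its striped subclass, checksums are omitted, each chunk is assumed to fit inside the current cell, and a plain XOR stands in for the real erasure codec. It only shows how the subclass can buffer the cell, delegate to the parent, and then do the two extra steps once a cell is full.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for DFSOutputStream: records every chunk it writes.
class SimpleOutputStream {
    final List<byte[]> written = new ArrayList<>();

    protected synchronized void writeChunk(byte[] b, int offset, int len) {
        byte[] copy = new byte[len];
        System.arraycopy(b, offset, copy, 0, len);
        written.add(copy);
    }
}

// Hypothetical stand-in for the striped subclass.
class StripedOutputStream extends SimpleOutputStream {
    private final int cellSize;      // bytes per striping cell (assumed fixed)
    private final byte[][] cells;    // buffered data cells of the current stripe
    private int cellIndex = 0;       // which data cell is being filled
    private int posInCell = 0;       // bytes already buffered in that cell
    final List<byte[]> parity = new ArrayList<>();

    StripedOutputStream(int cellSize, int dataCells) {
        this.cellSize = cellSize;
        this.cells = new byte[dataCells][cellSize];
    }

    int currentCellIndex() {
        return cellIndex;
    }

    @Override
    protected synchronized void writeChunk(byte[] b, int offset, int len) {
        // Simplifying assumption: a chunk never crosses a cell boundary.
        if (len > cellSize - posInCell) {
            throw new IllegalArgumentException("chunk crosses cell boundary");
        }
        // Buffer the chunk in the current cell, then reuse the parent logic.
        System.arraycopy(b, offset, cells[cellIndex], posInCell, len);
        posInCell += len;
        super.writeChunk(b, offset, len);

        if (posInCell == cellSize) {
            // Step 1: advance the current index pointer.
            posInCell = 0;
            cellIndex++;
            // Step 2: a full stripe of data cells is present, so emit parity.
            if (cellIndex == cells.length) {
                parity.add(xorParity());
                cellIndex = 0;
            }
        }
    }

    // XOR of all data cells; a placeholder for the real codec.
    private byte[] xorParity() {
        byte[] p = new byte[cellSize];
        for (byte[] cell : cells) {
            for (int i = 0; i < cellSize; i++) {
                p[i] ^= cell[i];
            }
        }
        return p;
    }
}
```

With a cell size of 2 and 3 data cells, three 2-byte chunks fill one stripe: the parent records three chunks, and the subclass produces one parity cell and resets its index to 0.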