[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529866#comment-14529866 ]

Zhe Zhang commented on HDFS-7859:

Cool! I just registered. Thanks for organizing it Allen.

Erasure Coding: Persist EC schemas in NameNode

Key: HDFS-7859
URL: https://issues.apache.org/jira/browse/HDFS-7859
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin
Labels: BB2015-05-TBR
Attachments: HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, HDFS-7859.001.patch, HDFS-7859.002.patch

In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529858#comment-14529858 ]

Zhe Zhang commented on HDFS-7859:

[~aw] Could you explain a bit what this {{BB2015-05-TBR}} label means?
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529562#comment-14529562 ]

Kai Zheng commented on HDFS-7859:

Nicholas, you already moved this out of HDFS-7285, and AFAIK there was no plan to commit it in phase I. I thought the updated patch here is good to have so that it is ready for follow-on work once we get the merge done.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529859#comment-14529859 ]

Allen Wittenauer commented on HDFS-7859:

It means you haven't been paying attention to the bug bash emails. :)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529040#comment-14529040 ]

Tsz Wo Nicholas Sze commented on HDFS-7859:

Please do not commit this JIRA to the HDFS-7285 branch since we won't support multiple schemas for the moment.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521029#comment-14521029 ]

Xinwei Qin commented on HDFS-7859:

The 003 patch removes the MODIFY and REMOVE ECSchema editlog operations; they will be added later by another JIRA (HDFS-8295) once they are supported.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521376#comment-14521376 ]

Hadoop QA commented on HDFS-7859:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 37s | Pre-patch HDFS-7285 compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. |
| {color:red}-1{color} | checkstyle | 7m 48s | The applied patch generated 10 additional checkstyle issues. |
| {color:green}+1{color} | install | 1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 3m 11s | The patch appears to introduce 9 new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native | 3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 239m 34s | Tests failed in hadoop-hdfs. |
| | | 288m 5s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time Unsynchronized access at DFSOutputStream.java:[line 142] |
| | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] |
| | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:[line 105] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] |
| | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
| | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() At ErasureCodingZoneManager.java:[line 116] |
| | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) At ErasureCodingZoneManager.java:[line 81] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int) At StripedBlockUtil.java:[line 167] |
| Failed unit tests | hadoop.hdfs.server.namenode.TestMetadataVersionOutput |
| | hadoop.hdfs.TestDFSClientRetries |
| | hadoop.hdfs.server.namenode.TestCheckpoint |
| | hadoop.hdfs.TestDFSOutputStream |
| | hadoop.hdfs.TestDFSRollback |
| | hadoop.hdfs.server.namenode.TestCreateEditsLog |
| | hadoop.hdfs.protocol.TestLayoutVersion |
| | hadoop.hdfs.TestDFSFinalize |
| | hadoop.hdfs.server.namenode.TestDeleteRace |
| |
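Several of the FindBugs rows in this report read "Result of integer multiplication cast to long". As a rough illustration of what that warning category means (this is not code from the HDFS-7285 branch; the class, method names, and sample values below are hypothetical), the product of two `int` operands is computed in 32-bit arithmetic and can overflow before it is widened to `long`:

```java
// Hypothetical illustration only; names and values are assumptions,
// not taken from BlockInfoStriped or StripedBlockUtil.
class IntMultiplyOverflow {
    // The flagged pattern: the multiplication happens in int arithmetic,
    // and only the (possibly overflowed) result is widened to long.
    static long widenedTooLate(int cells, int cellSize) {
        return cells * cellSize;
    }

    // Widening one operand first makes the whole multiplication 64-bit.
    static long widenedFirst(int cells, int cellSize) {
        return (long) cells * cellSize;
    }

    public static void main(String[] args) {
        int cells = 40_000;
        int cellSize = 64 * 1024;
        System.out.println("int multiply, then widen: " + widenedTooLate(cells, cellSize));
        System.out.println("widen, then multiply:     " + widenedFirst(cells, cellSize));
    }
}
```

The usual fix is the one shown in `widenedFirst`: cast one operand to `long` before multiplying, so small inputs behave the same and large inputs no longer wrap around.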
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518754#comment-14518754 ]

Zhe Zhang commented on HDFS-7859:

[~aw] Thanks again for bringing in the feature-branch pre-commit Jenkins functionality! It's really helpful. We just saw another successful run under HDFS-7678.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517414#comment-14517414 ]

Allen Wittenauer commented on HDFS-7859:

P.S., thanks for letting me use this issue as a guinea pig. :D
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517393#comment-14517393 ]

Allen Wittenauer commented on HDFS-7859:

bq. 263m 35s

Youch. Just under the wire.

bq. git revision HDFS-7285 / bc3091b

(y) So yes, it switched to the feature branch to run the tests, as was expected.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516609#comment-14516609 ]

Hadoop QA commented on HDFS-7859:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 15m 2s | Pre-patch HDFS-7285 compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. |
| {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 49s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 14s | The applied patch generated 1 release audit warnings. |
| {color:red}-1{color} | checkstyle | 5m 48s | The applied patch generated 10 additional checkstyle issues. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 3m 18s | The patch appears to introduce 11 new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native | 3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 216m 12s | Tests failed in hadoop-hdfs. |
| | | 263m 35s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 89% of time Unsynchronized access at DFSOutputStream.java:[line 142] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.DFSStripedInputStream.planReadPortions(int, int, long, int, int) At DFSStripedInputStream.java:[line 95] |
| | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:[line 104] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] |
| | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
| | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() At ErasureCodingZoneManager.java:[line 116] |
| | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) At ErasureCodingZoneManager.java:[line 81] |
| | org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$AddECSchemaOp.toString() makes inefficient use of keySet iterator instead of entrySet iterator At FSEditLogOp.java:[line 4552] |
| | org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$ModifyECSchemaOp.toString() makes inefficient use of keySet iterator instead of entrySet iterator At FSEditLogOp.java:[line 4624] |
| | org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.writeECSchema(DataOutputStream, ECSchema) makes inefficient use of keySet iterator instead of entrySet iterator At FSImageSerialization.java:[line 792] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:to long in
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515030#comment-14515030 ]

Allen Wittenauer commented on HDFS-7859:

test-patch.sh reads the name of the patch, not any of the JIRA metadata. So if the patch is named something generic, it assumes the patch is for trunk. See HowToContribute for the official rules, but as you can see from the name of the patch above, it knows about a few different ways of naming patches.
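The filename-driven branch selection described in this comment can be sketched roughly as follows. This is not the logic of test-patch.sh itself: the regular expression and the `ISSUE[-BRANCH].NNN.patch` convention are assumptions inferred only from the attachment names in this thread (`HDFS-7859-HDFS-7285.002.patch` ran against HDFS-7285, while generically named patches were treated as trunk), and the class and method names are hypothetical. HowToContribute remains the authority on the real rules.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real test-patch.sh rules may differ.
class PatchBranchGuess {
    // Assumed convention: ISSUE[-BRANCH].NNN.patch
    // group(1) = issue key, group(2) = optional branch, group(3) = revision.
    private static final Pattern NAME =
        Pattern.compile("^([A-Z]+-\\d+)(?:-(.+))?\\.(\\d+)\\.patch$");

    // Returns the branch encoded in the file name, or "trunk" when the
    // name carries no branch (or does not match the assumed convention).
    static String branchFor(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (m.matches() && m.group(2) != null) {
            return m.group(2);
        }
        return "trunk";
    }

    public static void main(String[] args) {
        System.out.println(branchFor("HDFS-7859-HDFS-7285.002.patch"));
        System.out.println(branchFor("HDFS-7859.001.patch"));
    }
}
```

The point of the sketch is only that nothing from the JIRA issue (target version, parent task) enters the decision; the attachment name is the sole input.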
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514890#comment-14514890 ]

Zhe Zhang commented on HDFS-7859:

[~aw] I quickly went through the HDFS-7285 sub-tasks. If you'd like, you can try with HDFS-8236. I actually tried with HDFS-8033 earlier, but it still tried to apply the patch against trunk. Maybe it's because I didn't set the target version to HDFS-7285 _when submitting the patch_.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514233#comment-14514233 ]

Allen Wittenauer commented on HDFS-7859:

(now we just need a submit button. lol)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516309#comment-14516309 ]

Allen Wittenauer commented on HDFS-7859:

FYI, there are now two of these running:

https://builds.apache.org/job/PreCommit-HDFS-Build/10424/console
https://builds.apache.org/job/PreCommit-HDFS-Build/10425/console

It's still churning through the hadoop-hdfs unit tests on the one that [~xinwei] submitted earlier. hadoop-hdfs is one of the slowest sets of unit tests we have. I have a hunch that you folks have added code in this branch which has made it even slower ... to the point that Jenkins will likely kill the test-patch job before it finishes.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514718#comment-14514718 ]

Allen Wittenauer commented on HDFS-7859:

I did some playing with a test JIRA this morning. IIRC, it looks like Submit Patch is only available to the requester and the assignee when the JIRA is in the 'in progress' status. The 'in progress' status can only be changed by the assignee and/or the requester. I then thought, well, I'll force it through Jenkins... but test-patch.sh is smart in that it will only process JIRAs that are in Patch Available status. So while I could have changed the meta info in the JIRA to force it to kick off, I didn't want to freak anyone out more than I already had by popping up in here. I thought it was going to be an easy/quick test. :(

Running test-patch.sh as a developer against this JIRA # *does* run it against the HDFS-7285 branch though, as expected. :D

(I had tested patches against branch-2, but hadn't had a chance to test against a dev branch... so when this one was updated last night, I thought it'd be a good guinea pig.)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514629#comment-14514629 ]

Zhe Zhang commented on HDFS-7859:

Thanks Allen. Do you know why Submit Patch isn't available here?
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516433#comment-14516433 ]

Hadoop QA commented on HDFS-7859:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 15m 23s | Pre-patch HDFS-7285 compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. |
| {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 41s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 16s | The applied patch generated 1 release audit warnings. |
| {color:red}-1{color} | checkstyle | 4m 1s | The applied patch generated 9 additional checkstyle issues. |
| {color:green}+1{color} | install | 1m 50s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 3m 13s | The patch appears to introduce 11 new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native | 3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 217m 55s | Tests failed in hadoop-hdfs. |
| | | 263m 53s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 89% of time Unsynchronized access at DFSOutputStream.java:[line 142] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.DFSStripedInputStream.planReadPortions(int, int, long, int, int) At DFSStripedInputStream.java:[line 95] |
| | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:[line 104] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] |
| | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
| | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() At ErasureCodingZoneManager.java:[line 116] |
| | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) At ErasureCodingZoneManager.java:[line 81] |
| | org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$AddECSchemaOp.toString() makes inefficient use of keySet iterator instead of entrySet iterator At FSEditLogOp.java:[line 4552] |
| | org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$ModifyECSchemaOp.toString() makes inefficient use of keySet iterator instead of entrySet iterator At FSEditLogOp.java:[line 4624] |
| | org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.writeECSchema(DataOutputStream, ECSchema) makes inefficient use of keySet iterator instead of entrySet iterator At FSImageSerialization.java:[line 792] |
| | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:to long in
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498895#comment-14498895 ]

Tsz Wo Nicholas Sze commented on HDFS-7859:

HDFS-8062 does not require this since the default schema can be hard-coded.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498847#comment-14498847 ]

Kai Zheng commented on HDFS-7859:

[~szetszwo] I haven't had time to sort out the complete list yet, but I thought HDFS-8062 would be one.
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497306#comment-14497306 ] Kai Zheng commented on HDFS-7859: - [~szetszwo], bq.Since we do not yet support add/delete/update/rename schema operations, we don't need to persist anything in NN at this moment. We will support some of these schema operations down the road. We may persist schemas at that time. Sound good? Please note it's not true that we don't need to persist anything in NN at this moment. We had already persisted some hard-coded values that should be covered by a schema in the image. Without this, we will definitely need to revisit the image format change some time later. As I said above, the schema definition is flexible enough that, if we persist the whole schema object in the image, we are unlikely to need to change the image later. Please note this issue blocks many subsequent issues and I thought we still have enough time for them right before the merge happening. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497447#comment-14497447 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- ... We had already persisted some hard-coded values that should be covered by a schema ... What do you mean? Could you give an example? ... Please note this issue blocks many subsequent issues and I thought we still have enough time for them right before the merge happening. What are the subsequent issues? Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497485#comment-14497485 ] Kai Zheng commented on HDFS-7859: - bq.What do you mean? Could you give an example? Well, my last statement was wrong and inaccurate. After double-checking the related code, I saw that only striped blocks derived from the following hard-coded values are persisted in the image. So please ignore that statement. bq.What are the subsequent issues? We do have some and will sort them out later. I have opened HDFS-8156 to resolve some dependencies caused by HDFS-7866, originally planned to be done here. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495853#comment-14495853 ] Kai Zheng commented on HDFS-7859: - Hi [~szetszwo], Per your request I updated the doc in HDFS-7337 accordingly. I entirely rewrote the schema section, which now mainly reflects the existing related discussions and even implementations. I hope it addresses your questions here well. Your further comments and questions are very welcome. Thanks in advance! Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493702#comment-14493702 ] Xinwei Qin commented on HDFS-7859: --- OK, I will track it. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493695#comment-14493695 ] Kai Zheng commented on HDFS-7859: - Note I have updated the patch in HDFS-7866 to align with this. Once it gets in, this one can be rebased and go in as well. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493699#comment-14493699 ] Xinwei Qin commented on HDFS-7859: --- [~drankye], thanks for your comments.
{quote} 1. Looks like this couples with HDFS-7866. Maybe I could commit HDFS-7866 first and then this gets all the left work done. Will it work for you this way? {quote} Yes, committing HDFS-7866 first is better.
bq. 2. What methods can ECSchemaManager call to make it happen? Some methods like {{logAddECSchema()}} in {{FSEditLog.java}} are missing; I will add them in the next patch.
bq. 3. In ECSchemaManager, new methods like addECSchema are not necessarily public. I will change them to package-private.
bq. 4. Are we supporting the two formats? Please add Javadoc to explain them, thanks. Yes, two formats are supported. These methods are only called during NameNode startup or when doing a checkpoint, and which method is called depends on the FSImage format. I will add detailed Javadoc for them.
bq. 5. Would you have separate issue(s) for the following? I will create a new issue for it.
Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495498#comment-14495498 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- The patch under this JIRA handles saving / loading these default schemas in fsimage. I think this is necessary even without loading custom schemas from XML. Otherwise we cannot guarantee the NameNode which loads the fsimage has the same default schemas as the NameNode which saved it. It is obviously even more necessary when we add custom schemas ... I think we should not persist anything to NN before we have a clear design since we don't know what to persist. For example, should we persist schema ID? We are not able to answer this question since we don't even know if a schema should have an ID. If we change the layout later on, it requires cluster upgrade for the new layout and we have to support the old layout for backward compatibility. For now, I suggest to just hard code the only (6,3)-Reed-Solomon schema. We don't even need the xml file. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495546#comment-14495546 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- ... schema name for the ID purpose. ... There are a few choices:
# Using schema name as ID
# A schema name and a separate numeric ID
# Multiple schema names and a numeric ID
Why use #1?
Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495553#comment-14495553 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- ... We would persist the whole schema object ... How can we be sure that the schema object format won't change? Since we do not yet support add/delete/update/rename schema operations, we don't need to persist anything in NN at this moment. We will support some of these schema operations down the road. We may persist schemas at that time. Sound good? Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495442#comment-14495442 ] Zhe Zhang commented on HDFS-7859: - [~szetszwo] / [~drankye]: The [phasing plan | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14391207page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14391207] I posted might be a little confusing with regard to schemas. My apologies. In the offline meetup on 03/31, we didn't reach a clear conclusion on how much of the schema work to include before merging. Therefore I left it in phase I, but marked it as optional. My thought was that we could make a better decision after observing how fast the work could proceed. Up to this point I think this thread is going pretty well and it seems we can have a multi-schema implementation when other HDFS-7285 tasks are done (see details below). Good [questions | https://issues.apache.org/jira/browse/HDFS-7859?focusedCommentId=14494933page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14494933] on schema design. I think we eventually need to answer them in the broader scope of HDFS-7337. IIUC HDFS-7859 / HDFS-7866 are not touching most of the tricky scenarios. Based on Kai's latest [comment | https://issues.apache.org/jira/browse/HDFS-7866?focusedCommentId=14494050page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14494050], HDFS-7866 will mostly handle _default_ schemas embedded in the {{ECSchemaManager}} code. The patch under this JIRA handles saving / loading these default schemas in fsimage. I think this is necessary even without loading custom schemas from XML. Otherwise we cannot guarantee the NameNode which loads the fsimage has the same default schemas as the NameNode which saved it. It is obviously even more necessary when we add custom schemas.
The logic in the patch is quite straightforward; it's mostly about serializing / deserializing schemas. So here's my proposal:
# Shrink this patch to get rid of the logic for modifying and removing schemas ({{ECSchemaManager#modifyECSchema}} and {{OP_MODIFY_EC_SCHEMA}}).
# Repurpose HDFS-7866 to focus on loading custom schemas from site xml files.
[~szetszwo], [~drankye], [~vinayrpet]: let me know if you agree with the above. If we are all synced on this, how about moving this JIRA back to HDFS-7285 and keeping HDFS-7866 under HDFS-8031? Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
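To illustrate the kind of serialize / deserialize logic described in the comment above, here is a minimal sketch that writes a schema to a DataOutputStream and reads it back. SchemaCodecSketch, its fields, and the byte layout are assumptions made for illustration only; this is not the real ECSchema class and not the fsimage format used by the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a schema; NOT the real ECSchema or fsimage layout.
class SchemaCodecSketch {
  final String name;
  final int dataUnits;
  final int parityUnits;
  final Map<String, String> options;

  SchemaCodecSketch(String name, int dataUnits, int parityUnits,
      Map<String, String> options) {
    this.name = name;
    this.dataUnits = dataUnits;
    this.parityUnits = parityUnits;
    this.options = new LinkedHashMap<>(options);
  }

  // Writes the fixed fields first, then the option count and each key/value pair.
  void write(DataOutputStream out) throws IOException {
    out.writeUTF(name);
    out.writeInt(dataUnits);
    out.writeInt(parityUnits);
    out.writeInt(options.size());
    for (Map.Entry<String, String> e : options.entrySet()) {
      out.writeUTF(e.getKey());
      out.writeUTF(e.getValue());
    }
  }

  // Reads the fields back in exactly the order write() emitted them.
  static SchemaCodecSketch read(DataInputStream in) throws IOException {
    String name = in.readUTF();
    int dataUnits = in.readInt();
    int parityUnits = in.readInt();
    int count = in.readInt();
    Map<String, String> options = new LinkedHashMap<>();
    for (int i = 0; i < count; i++) {
      String key = in.readUTF();
      options.put(key, in.readUTF());
    }
    return new SchemaCodecSketch(name, dataUnits, parityUnits, options);
  }

  // Serializes to an in-memory buffer and deserializes again.
  static SchemaCodecSketch roundTrip(SchemaCodecSketch schema) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      try (DataOutputStream out = new DataOutputStream(buffer)) {
        schema.write(out);
      }
      try (DataInputStream in = new DataInputStream(
          new ByteArrayInputStream(buffer.toByteArray()))) {
        return read(in);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Writing the option count ahead of the pairs lets the reader consume a variable number of options without a terminator, which is one simple way such a layout could tolerate codec-specific additions.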
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495556#comment-14495556 ] Kai Zheng commented on HDFS-7859: - bq.Using schema name as ID Since we want to keep {{ECSchema}} lightweight and so don't have a field like {{description}}, a friendly name like {{RS-6-3}} makes more sense than a numeric ID. Users should clearly understand a schema before using it to create any zone, and the name helps with identifying it. bq.We don't even need the xml file. Yeah, if we instead define a schema through a command by specifying the schema parameters, that should also be OK. I don't have a strong preference about that. Any file format, or even not using a file, would also work I guess. We talked about this in the meetup, and it looks like we agreed on an XML file. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495493#comment-14495493 ] Kai Zheng commented on HDFS-7859: - Hi [~zhz], Thanks for taking care of this and for your good suggestion. It looks reasonable to me. This will be a more solid base for the merge. To summarize further: 1. This issue HDFS-7859 would provide two system-defined schemas in Java code: one is the system default schema (rs-6-3), already there; the other is a new one, suggested to be rs-10-4. It also ensures the two schemas will be persisted in the image/editlog for later querying. 2. The remaining gaps will be handled as follow-on work in HDFS-7866, mainly about how to customize site-specific schemas through an XML file. The design will also be updated. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
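The summary above, with two system-defined schemas referenced by name, could look roughly like the following sketch. SystemSchemasSketch and its methods are hypothetical names and do not reflect the actual {{ECSchemaManager}} code; only the schema names rs-6-3 and rs-10-4 come from the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry of system-defined schemas, keyed by schema name.
class SystemSchemasSketch {
  static final String DEFAULT_NAME = "rs-6-3";

  // Maps schema name to {dataUnits, parityUnits}.
  private static final Map<String, int[]> SCHEMAS = new LinkedHashMap<>();

  static {
    SCHEMAS.put("rs-6-3", new int[] {6, 3});
    SCHEMAS.put("rs-10-4", new int[] {10, 4});
  }

  // Returns a copy so callers cannot mutate the registered definition.
  static int[] lookup(String name) {
    int[] units = SCHEMAS.get(name);
    if (units == null) {
      throw new IllegalArgumentException("Unknown schema: " + name);
    }
    return units.clone();
  }

  // Total number of units (data plus parity) in one stripe of the named schema.
  static int totalUnits(String name) {
    int[] units = lookup(name);
    return units[0] + units[1];
  }
}
```

A zone would then only need to store the schema name, for example SystemSchemasSketch.DEFAULT_NAME, and resolve the parameters through lookup when needed.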
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495517#comment-14495517 ] Kai Zheng commented on HDFS-7859: - bq.we don't know what to persist. For example, should we persist schema ID? We are not able to answer this question since we don't even know if a schema should have an ID. It's not true. We have {{ECSchema}} defined and it uses schema name for the ID purpose. We would persist the whole schema object. Although the ongoing work isn't reflected in the design doc, we did do it following our related discussion. In the meetup with [~zhz] and [~jingzhao], we covered this aspect and even your questions already. It's my mistake that I didn't write it down clearly and update the doc accordingly. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495560#comment-14495560 ] Kai Zheng commented on HDFS-7859: - bq.How can we be sure that the schema object format won't change? Good question. In the {{ECSchema}} class, in addition to the common parameters widely used by typical erasure codecs, an {{options}} map is also included, so potentially any complex codec can use it to hold its own specific parameters or key-value pairs. Such parameters are left to the corresponding erasure coders to interpret. We try to make it flexible enough to avoid such a change, but in case it needs to change anyway, I thought that is supported, I mean through the image layout version. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
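As a rough illustration of the {{options}} map idea in the comment above, the sketch below shows how a coder might interpret a codec-specific option stored as a string key-value pair. SchemaOptionsSketch, intOption, and the option key used in the usage note are hypothetical and are not part of the real {{ECSchema}} API.

```java
import java.util.Map;

// Hypothetical helper showing how a coder could interpret its own options.
class SchemaOptionsSketch {

  // Reads an integer option, falling back to a default when the key is absent.
  static int intOption(Map<String, String> options, String key, int defaultValue) {
    String raw = options.get(key);
    if (raw == null) {
      return defaultValue;
    }
    try {
      return Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          "Option " + key + " is not an integer: " + raw, e);
    }
  }
}
```

A codec that needs an extra parameter could call, for example, intOption(options, "localGroupSize", 0), where the key name is made up for this sketch. Because unknown keys are simply ignored by coders that do not use them, new parameters can be added without changing the persisted layout.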
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494933#comment-14494933 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- I have the following questions:
- How to add a schema? Using a command?
- Then, is it possible to delete a schema? It seems that we have to support deletion since a schema may be created by mistake or there could be typos when creating a schema.
- If deletion is supported, what to do with the existing files with that schema?
- Do we support renaming a schema?
- Does an EC schema have a schema ID?
I think we need a design for EC schema to answer all these questions and specify what operations are supported. BTW, we only support one schema (6,3)-Reed-Solomon in the first phase HDFS-7285. I think we should focus on finishing a complete, working basic EC feature and get HDFS-7285 merged to trunk. How about moving this JIRA and related JIRAs to HDFS-8031 and deferring the work? Sorry for commenting on this late and thanks for all the good work. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495040#comment-14495040 ] Kai Zheng commented on HDFS-7859: - Hi [~szetszwo], thanks for taking care of this. These questions were considered during the related work. The overall design and discussion are in HDFS-7337, would you take a look at it. Let's discuss further there. I will sort out the latest discussions and clearly answer your questions there. Thanks. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495049#comment-14495049 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- ... The overall design and discussion are in HDFS-7337, would you take a look at it. ... Yes, I looked at it earlier but it did not answer my questions. Since HDFS-7337 is already under HDFS-8031, let's move all the related works to HDFS-8031. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495059#comment-14495059 ] Kai Zheng commented on HDFS-7859: - HDFS-7337 is rather large, we're implementing its related tasks incrementally. In your view, what's the difficulty that makes this sub-task hard to be in the merge? Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495089#comment-14495089 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- I just want to make sure we have the right design before adding code. Persisting schema to fsimage is not an obvious task and it is not required in HDFS-7285. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493484#comment-14493484 ] Kai Zheng commented on HDFS-7859: - Thanks [~xinwei] for the great work! Some comments or questions after a quick look:
1. Looks like this couples with HDFS-7866. Maybe I could commit HDFS-7866 first and then this gets all the left work done. Will it work for you this way?
2. I guess when reloading of the predefined XML happens, some schemas need to be updated/removed/added in the editlog. What methods can ECSchemaManager call to make it happen? I did notice some OPs like OP_REMOVE_EC_SCHEMA are added, but where are the OPs triggered?
3. In {{ECSchemaManager}}, new methods like {{addECSchema}} are not necessarily public.
4. A question I would like your help with: it's not clear to me when to call {{loadECSchemas}} and when to call {{loadState}}. Are we supporting the two formats? Please add Javadoc to explain them, thanks.
5. Would you have separate issue(s) for the following?
{code}
@Override
protected void toXml(ContentHandler contentHandler) throws SAXException {
  // TODO Support for offline EditsVisitor over an OEV XML file
}

@Override
void fromXml(Stanza st) throws InvalidXmlException {
  // TODO Support for offline EditsVisitor over an OEV XML file
}
{code}
Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492394#comment-14492394 ] Xinwei Qin commented on HDFS-7859: --- Hi [~drankye] The patch has been completed, but is a little big. I will post it about half an hour later at home. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14491942#comment-14491942 ] Kai Zheng commented on HDFS-7859: - Any update or question? Thanks. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487291#comment-14487291 ] Xinwei Qin commented on HDFS-7859: --- Hi [~drankye], Thanks for your clarification and suggestion. I'm more clear on this issue, and will post the patch ASAP. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487072#comment-14487072 ] Kai Zheng commented on HDFS-7859: - Hi [~xinwei], About persisting the schema object, I guess the work in HDFS-8023 may be helpful for your reference, as pointed out by [~vinayrpet] above. And would you take a look at the initial code attached in HDFS-7866 and HDFS-8062 so that you have a clearer idea about the scope of this issue? Thanks. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482766#comment-14482766 ] Vinayakumar B commented on HDFS-7859: - bq. When we have this work done, we may also have an idea of how to serialize/deserialize an EC schema in RPC between NameNode and client/DataNode. This is included in the latest patch provided in HDFS-8023. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395186#comment-14395186 ] Kai Zheng commented on HDFS-7859: - When we have this work done, we may also have an idea of how to serialize/deserialize an EC schema in RPC between NameNode and client/DataNode. If necessary, we may create another issue to handle that aspect. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395184#comment-14395184 ] Kai Zheng commented on HDFS-7859: - To clarify, this issue focuses on how to persist EC schemas in the NameNode, as already agreed, in the fsimage and editlog. In other words, this issue changes the fsimage and editlog to persist EC schemas, and should consider related concerns such as the image version and how to upgrade/downgrade. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
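Editorial sketch for the comment above: one way to picture "persist a schema and consider the image version" is a versioned binary record that a reader refuses to load when the version is newer than it understands. This is only an illustration under stated assumptions. The class `EcSchemaSketch`, the `Schema` fields (name, codec, data and parity unit counts), the constant `SCHEMA_LAYOUT_VERSION`, and the value "RS-6-3" are all hypothetical, and this is not the actual fsimage or editlog format used by HDFS.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch; none of these names are actual HDFS classes.
final class EcSchemaSketch {
    // Assumed layout version at which schema records are introduced.
    static final int SCHEMA_LAYOUT_VERSION = 1;

    // Minimal stand-in for an EC schema that zones could reference by name.
    static final class Schema {
        final String name;
        final String codec;
        final int dataUnits;
        final int parityUnits;

        Schema(String name, String codec, int dataUnits, int parityUnits) {
            this.name = name;
            this.codec = codec;
            this.dataUnits = dataUnits;
            this.parityUnits = parityUnits;
        }
    }

    // Serialize the schema, prefixed with the layout version.
    static byte[] write(Schema s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(SCHEMA_LAYOUT_VERSION);
            out.writeUTF(s.name);
            out.writeUTF(s.codec);
            out.writeInt(s.dataUnits);
            out.writeInt(s.parityUnits);
        }
        return buf.toByteArray();
    }

    // Deserialize, rejecting records written by a newer layout version
    // (the downgrade case this reader cannot handle).
    static Schema read(byte[] bytes) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            int version = in.readInt();
            if (version > SCHEMA_LAYOUT_VERSION) {
                throw new IOException("Unsupported layout version " + version);
            }
            // Fields are read in the same order they were written.
            String name = in.readUTF();
            String codec = in.readUTF();
            int dataUnits = in.readInt();
            int parityUnits = in.readInt();
            return new Schema(name, codec, dataUnits, parityUnits);
        }
    }
}
```

In this sketch the version check is what makes upgrade/downgrade explicit: an older reader fails fast on a newer record instead of misreading it.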
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393895#comment-14393895 ] Xinwei Qin commented on HDFS-7859: --- Hi [~drankye], I'm interested in this issue. If you have no time to work on it, could you reassign it to me? Thanks. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393901#comment-14393901 ] Kai Zheng commented on HDFS-7859: - Great, please take it. Thanks. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)