[ https://issues.apache.org/jira/browse/HADOOP-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781079#comment-15781079 ]
Wei-Chiu Chuang edited comment on HADOOP-13665 at 12/27/16 7:16 PM: -------------------------------------------------------------------- Hi [~lewuathe], thank you for your updated patch, but I am sorry that I think I missed a few critical issues in the patch in my last review. First of all, ec codecs are specified without implementation -- For example, DFSStripedoutputStream called {{CodecUtil.createRawEncoder}} but currently all defined schemas specifies either rs-default, rs-legacy or xor. It should be intuitive that codecs are not implementation-specific, that is, a native rs-default codec implementation should be interoperable with a pure Java rs-default codec implementation. So adding a new codec like {{NATIVE_XOR_CODEC_NAME}} doesn't make sense. Second, it is potentially risky to assume native codec is _always_ preferred over Java codec implementation, for instance if native codec has a bug and we are forced to use Java codec until the native one is fixed. Lastly, can we follow the existing conventions used in in-transit encryption? For example, the default value of configuration key hadoop.security.crypto.codec.classes.aes.ctr.nopadding is org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec, org.apache.hadoop.crypto.JceAesCtrCryptoCodec which means OpensslAesCtrCryptoCodec is preferred over JceAesCtrCryptoCodec, but it is configurable. Also one nit: please use LOG.debug with curl bracket style is preferred for performance reasons: for example, instead of {code} LOG.debug("Creating raw encoder by using " + rawCoderFactoryKey); {code} use {code} LOG.debug("Creating raw encoder by using {}", rawCoderFactoryKey); {code} was (Author: jojochuang): Hi [~lewuathe], thank you for your updated patch, but I am sorry that I think I missed a few critical issues in the patch. First of all, ec codecs are specified without implementation -- For example, DFSStripedoutputStream called {{CodecUtil.createRawEncoder}} but currently all defined schemas specifies either rs-default, rs-legacy or xor. It should be intuitive that codecs are not implementation-specific, that is, a native rs-default codec implementation should be interoperable with a pure Java rs-default codec implementation. So adding a new codec like {{NATIVE_XOR_CODEC_NAME}} doesn't make sense. Second, it is potentially risky to assume native codec is _always_ preferred over Java codec implementation, for instance if native codec has a bug and we are forced to use Java codec until the native one is fixed. Lastly, can we follow the existing conventions used in in-transit encryption? For example, the default value of configuration key hadoop.security.crypto.codec.classes.aes.ctr.nopadding is org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec, org.apache.hadoop.crypto.JceAesCtrCryptoCodec which means OpensslAesCtrCryptoCodec is preferred over JceAesCtrCryptoCodec, but it is configurable. Also one nit: please use LOG.debug with curl bracket style is preferred for performance reasons: for example, instead of {code} LOG.debug("Creating raw encoder by using " + rawCoderFactoryKey); {code} use {code} LOG.debug("Creating raw encoder by using {}", rawCoderFactoryKey); {code} > Erasure Coding codec should support fallback coder > -------------------------------------------------- > > Key: HADOOP-13665 > URL: https://issues.apache.org/jira/browse/HADOOP-13665 > Project: Hadoop Common > Issue Type: Sub-task > Components: io > Reporter: Wei-Chiu Chuang > Assignee: Kai Sasaki > Labels: hdfs-ec-3.0-must-do > Attachments: HADOOP-13665.01.patch, HADOOP-13665.02.patch, > HADOOP-13665.03.patch, HADOOP-13665.04.patch > > > The current EC codec supports a single coder only (by default pure Java > implementation). If the native coder is specified but is unavailable, it > should fallback to pure Java implementation. > One possible solution is to follow the convention of existing Hadoop native > codec, such as transport encryption (see {{CryptoCodec.java}}). It supports > fallback by specifying two or multiple coders as the value of property, and > loads coders in order. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org