Hi Jim and Tim,

Thanks for your replies. I'm familiar with the APL and the GPL. Here is some discussion of Hadoop support for LZMA:
https://issues.apache.org/jira/browse/HADOOP-6837

2017-06-12 23:40 GMT+08:00 Jim Apple <[email protected]>:

> Because Impala is part of the ASF, it cannot contain any GPL code.
>
> https://www.apache.org/legal/resolved.html
>
> "However, if the component is only needed for optional features, a
> project can provide the user with instructions on how to obtain and
> install the non-included work. Optional means that the component is
> not required for standard use of the product or for the product to
> achieve a desirable level of quality. The question to ask yourself in
> this situation is: 'Will the majority of users want to use my product
> without adding the optional components?'"
>
> As I understand it, this is the rule by which Impala can use
> https://github.com/cloudera/impala-lzo
>
> On Mon, Jun 12, 2017 at 8:30 AM, Tim Armstrong <[email protected]>
> wrote:
> > You would need to add a new codec to the Impala source tree. The codecs
> > are implemented in be/src/util/codec.h, be/src/util/compress.h and
> > be/src/util/decompress.h. There are a few other places you may need to
> > change. I would just "git grep -i gzip" to see how the gzip codec is
> > implemented.
> >
> > For compressed text files you would also need to add support to the
> > frontend, e.g. in
> > fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
> >
> > I'm also not sure if there are any licensing issues here since the XZ
> > library is GPL licensed.
> >
> > On Sat, Jun 10, 2017 at 5:41 PM, 孙清孟 <[email protected]> wrote:
> >
> >> I have added lzma codec (hadoop-xz) to parquet(modify the parquet-format
> >> and parquet-mr) for hive, and get a higher compression ratio.
> >>
> >> But how add a new codec for Impala?
