[
https://issues.apache.org/jira/browse/HBASE-26259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422344#comment-17422344
]
Andrew Kyle Purtell commented on HBASE-26259:
---------------------------------------------
Similar to the _hbase-compression-snappy_ module, implemented with the Xerial
snappy JNI wrapper, for those who would prefer that over the aircompressor pure
Java option or the Hadoop native codec, I have added an
_hbase-compression-zstd_ module, which implements ZStandard support with BSD
licensed [zstd-jni|https://github.com/luben/zstd-jni/], for those who would
prefer it over the (limited/in development) aircompressor pure Java option or
the Hadoop native codec.
This addition was completed with less than 30 minutes of effort, and another 30
minutes or so for testing, which demonstrates how these compression modules can
be used to implement support for additional options easily.
> Fallback support to pure Java compression
> -----------------------------------------
>
> Key: HBASE-26259
> URL: https://issues.apache.org/jira/browse/HBASE-26259
> Project: HBase
> Issue Type: Sub-task
> Reporter: Andrew Kyle Purtell
> Assignee: Andrew Kyle Purtell
> Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-2
>
> Attachments: BenchmarkCodec.java, BenchmarksMain.java,
> RandomDistribution.java, ac_lz4_results.pdf, ac_snappy_results.pdf,
> ac_zstd_results.pdf, lz4_lz4-java_result.pdf, xerial_snappy_results.pdf
>
>
> Airlift’s aircompressor
> (https://search.maven.org/artifact/io.airlift/aircompressor) is an Apache 2
> licensed library, for Java 8 and up, available in Maven central, which
> provides pure Java implementations of gzip, lz4, lzo, snappy, and zstd and
> Hadoop compression codecs for same, claiming “_they are typically 300% faster
> than the JNI wrappers_.” (https://github.com/airlift/aircompressor). This
> library is under active development and up to date releases because it is
> used by Trino.
> Proposed changes:
> * Modify Compression.java such that compression codec implementation classes
> can be specified by configuration. Currently they are hardcoded as strings.
> * Pull in aircompressor as a ‘compile’ time dependency so it will be bundled
> into our build and made available on the server classpath.
> * Modify Compression.java to fall back to an aircompressor pure Java
> implementation if schema specifies a compression algorithm, a Hadoop native
> codec was specified as desired implementation, but the requisite native
> support is somehow not available.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)