[
https://issues.apache.org/jira/browse/HBASE-26258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Kyle Purtell resolved HBASE-26258.
-----------------------------------------
Fix Version/s: (was: 3.0.0-alpha-2)
(was: 2.5.0)
Resolution: Fixed
> Universal compression support
> -----------------------------
>
> Key: HBASE-26258
> URL: https://issues.apache.org/jira/browse/HBASE-26258
> Project: HBase
> Issue Type: Improvement
> Components: HFile, Operability
> Reporter: Andrew Kyle Purtell
> Assignee: Andrew Kyle Purtell
> Priority: Major
>
> Some Hadoop compression codecs became more available in recent Hadoop 3.x
> releases, addressed by HBASE-25940. This is nice but still requires native
> platform support, which to state the obvious is not available on all
> platforms and architectures, even if native libraries for some are bundled
> into jars.
> Airlift's aircompressor
> (https://search.maven.org/artifact/io.airlift/aircompressor) is an Apache 2
> licensed library, for Java 8 and up, available in Maven central, which
> provides pure Java implementations of desirable compression algorithms gzip,
> lz4, lzo, snappy, and zstd, and Hadoop compression codecs for same, claiming
> "_they are typically 300% faster than the JNI wrappers_."
> (https://github.com/airlift/aircompressor). This library is under active
> development and has up to date releases because it is used by Trino.
> We have another project that depends on universal availability of SNAPPY. I
> would like to make this change as a general improvement which also satisfies
> that requirement. (The as yet unnamed project will be contributed later.)
> Universal ZSTD support would also be very nice to have.
> Proposed changes:
> * Modify Compression.java such that compression codec implementation classes
> can be specified by configuration. Currently they are hardcoded as strings.
> * Pull in aircompressor as a 'compile' time dependency so it will be bundled
> into our build and made available on the server classpath.
> * Modify Compression.java to fall back to an aircompressor pure Java
> implementation when the schema specifies a compression algorithm and a
> Hadoop native codec is configured as the desired implementation, but the
> requisite native support is not available.
> The combination of these changes will provide universal (pure Java) support
> for these desired and desirable compression codecs while retaining default
> behavior, which is to load and utilize Hadoop native implementations of same,
> if native support is available. They will also let you override this default
> if you wish to chase the claimed benefits of the pure Java alternatives.
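
The selection and fallback behavior proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual change to Compression.java: the class CodecResolver, its methods, and the configuration key are hypothetical, a plain Map stands in for the real configuration object, and a class-loadability check stands in for a real native-support check.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helper, not part of HBase: picks a codec implementation class name.
class CodecResolver {

    // Returns the configured (or default native) codec class name if it is usable,
    // otherwise the pure Java fallback class name.
    static String resolve(Map<String, String> conf, String key,
                          String nativeDefault, String pureJavaFallback,
                          Predicate<String> usable) {
        // The implementation class comes from configuration instead of a hardcoded string.
        String preferred = conf.getOrDefault(key, nativeDefault);
        if (usable.test(preferred)) {
            return preferred;
        }
        // The preferred codec cannot be used on this platform: fall back.
        return pureJavaFallback;
    }

    // Simplified stand-in for a usability check: only tests whether the class can be
    // found on the classpath. A real check would also have to verify that the native
    // library behind the codec actually loads.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, CodecResolver.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // "example.compress.snappy.codec" is a made-up key used only for this sketch.
        String chosen = resolve(conf, "example.compress.snappy.codec",
                "org.apache.hadoop.io.compress.SnappyCodec",
                "io.airlift.compress.snappy.SnappyCodec",
                CodecResolver::isLoadable);
        System.out.println("Selected codec class: " + chosen);
    }
}
```

Setting the key to the pure Java class name would select it directly, which corresponds to overriding the default as described above.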