churro morales commented on HBASE-16710:

[~saint....@gmail.com] Hadoop will use the suffix .zst to denote that a file is 
compressed in the ZStandard format.  I have a patch up right now that does work 
with HBase, but unfortunately all of the block- or chunk-based compressors add 
a header in front of each chunk to denote the decompressed size.  Thus 
compression libraries like Snappy don't work when you compress outside of the 
Hadoop ecosystem.  I am currently working on a streaming-based approach such 
that if you compress through the CLI and put the file into HDFS, it will 
compress / decompress correctly. 
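To make the incompatibility concrete, here is a minimal sketch of the kind of per-chunk framing that Hadoop-style block compressors apply. It is an illustration only, not Hadoop's actual codec code: `frameChunk` is a hypothetical helper, and `Deflater` stands in for Snappy/zstd. The point is that each chunk is preceded by length headers that a standalone CLI tool like `zstd` neither writes nor expects, so the two formats cannot read each other's output.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;

public class BlockFraming {
    // Hypothetical helper: frames one chunk roughly the way a
    // Hadoop-style block compressor would (big-endian int lengths).
    public static byte[] frameChunk(byte[] raw) throws IOException {
        Deflater deflater = new Deflater();   // stand-in for Snappy/zstd
        deflater.setInput(raw);
        deflater.finish();
        byte[] buf = new byte[raw.length * 2 + 64];
        int compressedLen = deflater.deflate(buf);
        deflater.end();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(out);
        dos.writeInt(raw.length);        // header: decompressed size
        dos.writeInt(compressedLen);     // header: compressed size
        dos.write(buf, 0, compressedLen);
        dos.flush();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] framed = frameChunk("hello hadoop".getBytes("UTF-8"));
        // The leading 4 bytes carry the uncompressed length; a file
        // produced by the plain `zstd` CLI has no such prefix, so a
        // block decompressor would misread it.
        int declared = ((framed[0] & 0xff) << 24) | ((framed[1] & 0xff) << 16)
                     | ((framed[2] & 0xff) << 8)  | (framed[3] & 0xff);
        System.out.println(declared);
    }
}
```

A streaming codec avoids this by emitting the library's native frame format end to end, which is why a CLI-compressed file in HDFS would then decompress cleanly.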

I have tested the current patch by backporting it to a Hadoop 2.x branch with 
HBase, and it works.  The compression / decompression speed as well as the 
compression ratios are very good.  It should take a few weeks to finish up the 
streaming patch, as I have other things on my plate. 

I'll update this JIRA once I get something upstream. 

> Add ZStandard Codec to Compression.java
> ---------------------------------------
>                 Key: HBASE-16710
>                 URL: https://issues.apache.org/jira/browse/HBASE-16710
>             Project: HBase
>          Issue Type: Task
>    Affects Versions: 2.0.0
>            Reporter: churro morales
>            Assignee: churro morales
>            Priority: Minor
>         Attachments: HBASE-16710-0.98.patch, HBASE-16710-1.2.patch, 
> HBASE-16710.patch
>
> HADOOP-13578 is adding the ZStandardCodec to hadoop.  This is a placeholder 
> to ensure it gets added to hbase once this gets upstream.
