[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609043#comment-14609043 ]
Rohit Dholakia commented on HIVE-10438:
---------------------------------------
1. We see your point, but we believe that ResultSet compression using type
information delivers better compression ratios. For instance, the integer
plugin attached with the patch achieves a 10% higher compression ratio than
Snappy (results also attached). I think we can incorporate your suggestion by
adding a switch that selects either type-sensitive compression (different
compressors for different column types) or type-insensitive compression (e.g.,
the same compressor for all column types). For this, the ColumnCompressor
interface will only need to be extended by one method.
3. We have now updated the patch.
4. We have added a few tests to the src/test folder of hive-service that use
Snappy for compression and decompression with a few default values.
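As an illustration of the switch described in point 1, here is a minimal Java
sketch. Only the interface name ColumnCompressor comes from this discussion;
the method isTypeSensitive(), the ColumnType enum, the CompressorSelector
class, and both compressor classes are hypothetical names chosen for
illustration and may not match the patch. The JDK's Deflater stands in for
Snappy to avoid an external dependency.

```java
import java.io.ByteArrayOutputStream;
import java.util.EnumMap;
import java.util.Map;
import java.util.zip.Deflater;

// Hypothetical sketch: apart from the name ColumnCompressor, nothing here is
// taken from the HIVE-10438 patch.
class CompressionSwitchSketch {

    // Illustrative column types only; Hive's real type system is richer.
    enum ColumnType { INT, STRING }

    // Assumed shape of the plugin interface.
    interface ColumnCompressor {
        byte[] compress(byte[] columnBytes);

        // The one added method (assumed name): true if this compressor
        // depends on knowing the column type.
        boolean isTypeSensitive();
    }

    // Type-insensitive compressor: DEFLATE from the JDK, standing in for Snappy.
    static final class GenericCompressor implements ColumnCompressor {
        @Override
        public byte[] compress(byte[] columnBytes) {
            Deflater deflater = new Deflater();
            deflater.setInput(columnBytes);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            deflater.end();
            return out.toByteArray();
        }

        @Override
        public boolean isTypeSensitive() { return false; }
    }

    // Placeholder for an integer-specific plugin; it copies the input
    // unchanged, where a real plugin would apply an integer encoding.
    static final class IntColumnCompressor implements ColumnCompressor {
        @Override
        public byte[] compress(byte[] columnBytes) { return columnBytes.clone(); }

        @Override
        public boolean isTypeSensitive() { return true; }
    }

    // Applies the switch: per-type compressors when enabled, otherwise one
    // generic compressor for every column.
    static final class CompressorSelector {
        private final boolean typeSensitive;
        private final ColumnCompressor generic;
        private final Map<ColumnType, ColumnCompressor> perType =
                new EnumMap<>(ColumnType.class);

        CompressorSelector(boolean typeSensitive, ColumnCompressor generic) {
            this.typeSensitive = typeSensitive;
            this.generic = generic;
        }

        void register(ColumnType type, ColumnCompressor compressor) {
            if (!compressor.isTypeSensitive()) {
                throw new IllegalArgumentException("expected a type-sensitive compressor");
            }
            perType.put(type, compressor);
        }

        ColumnCompressor select(ColumnType type) {
            if (typeSensitive) {
                ColumnCompressor specific = perType.get(type);
                if (specific != null) {
                    return specific;
                }
            }
            // Switch off, or no plugin registered for this type.
            return generic;
        }
    }

    public static void main(String[] args) {
        CompressorSelector selector = new CompressorSelector(true, new GenericCompressor());
        selector.register(ColumnType.INT, new IntColumnCompressor());
        for (ColumnType type : ColumnType.values()) {
            System.out.println(type + " -> "
                    + selector.select(type).getClass().getSimpleName());
        }
    }
}
```

With the switch off, select() returns the generic compressor for every column
type, which corresponds to the type-insensitive mode suggested in the review.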
Thanks.
> Architecture for ResultSet Compression via external plugin
> -----------------------------------------------------------
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
> Issue Type: New Feature
> Components: Hive, Thrift API
> Affects Versions: 1.2.0
> Reporter: Rohit Dholakia
> Assignee: Rohit Dholakia
> Labels: patch
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch,
> Proposal-rscompressor.pdf, README.txt,
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip,
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which
> uses an external plugin.
> The patch has three aspects to it:
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality
> 2. A container to allow everyone to write and test ResultSet compressors with
> a query submitter (https://github.com/xiaom/hs2driver)
> Also attaching a design document explaining the changes, an experimental
> results document, and a PDF explaining how to set up the Docker container to
> observe end-to-end functionality of ResultSet compression.
> Review board link: https://reviews.apache.org/r/35792/