[ https://issues.apache.org/jira/browse/HIVE-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757809#action_12757809 ]
Ning Zhang commented on HIVE-819:
---------------------------------
Hi Yongqiang, the tests look good and should cover most cases. Other queries,
such as map-reduce joins, map-side joins, UDFs, UDAFs, etc., may fall into the
same code path. Namit and Zheng may correct me if I'm wrong.
As for where the check against double decompression should go, I prefer putting
it in LazyDecompressionCallbackImpl, since this integrity check is introduced by
the lazy decompression and is therefore part of its responsibility. There may
also be more callers of LazyDecompressionCallbackImpl besides BytesRefWritable.
If we put the check in LazyDecompressionCallbackImpl, we don't need to implement
it in each of its callers.
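Just to illustrate the idea, here is a minimal sketch of what such a guard could
look like if it lives in the callback itself rather than in the callers. The
class and field names below are made up for illustration and are not the actual
HIVE-819 patch:

{code}
// Hypothetical sketch: guard against double decompression inside the callback,
// so callers such as BytesRefWritable need no extra checks of their own.
class LazyDecompressionCallbackSketch {
  private byte[] compressed;     // compressed column block
  private byte[] uncompressed;   // filled in on first use
  private boolean decompressed = false;

  LazyDecompressionCallbackSketch(byte[] compressed) {
    this.compressed = compressed;
  }

  // Every caller goes through this method; the flag makes repeated calls cheap
  // and prevents decompressing the same block twice.
  public byte[] decompress() {
    if (!decompressed) {
      uncompressed = doRealDecompression(compressed);
      decompressed = true;
    }
    return uncompressed;
  }

  private byte[] doRealDecompression(byte[] in) {
    // placeholder for the real codec call (e.g. a Hadoop CompressionCodec)
    return in;
  }
}
{code}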
> Add lazy decompress ability to RCFile
> -------------------------------------
>
> Key: HIVE-819
> URL: https://issues.apache.org/jira/browse/HIVE-819
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Query Processor, Serializers/Deserializers
> Reporter: He Yongqiang
> Assignee: He Yongqiang
> Fix For: 0.5.0
>
> Attachments: hive-819-2009-9-12.patch
>
>
> This is especially useful for filter scanning.
> For example, for the query 'select a, b, c from table_rc_lazydecompress where
> a > 1;' we only need to decompress the block data of the b and c columns when
> some row's column 'a' in that block satisfies the filter condition.
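As a rough sketch of that scan pattern (the names below are hypothetical, not
the RCFile API): each column block is wrapped in a lazy handle, and only the
handles that are actually read pay the decompression cost.

{code}
// Illustrative sketch of filter-driven lazy decompression over one row group.
// LazyBlock and scanRowGroup are made-up names for illustration only.
class LazyScanSketch {
  interface LazyBlock { byte[] get(); }   // decompresses on first get()

  static void scanRowGroup(LazyBlock a, LazyBlock b, LazyBlock c, int rows) {
    // Column 'a' must always be decompressed to evaluate the predicate a > 1.
    byte[] aData = a.get();
    boolean anyMatch = false;
    for (int r = 0; r < rows; r++) {
      if (aData[r] > 1) { anyMatch = true; break; }
    }
    // Blocks for 'b' and 'c' are only touched when some row passed the filter,
    // so their decompression is skipped entirely for non-matching row groups.
    if (anyMatch) {
      byte[] bData = b.get();
      byte[] cData = c.get();
      // ... project the matching rows from bData and cData ...
    }
  }
}
{code}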