[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689151#action_12689151
]
He Yongqiang commented on HIVE-352:
-----------------------------------
One problem with this RCFile is that it needs to know the needed columns in
advance, so that it can skip unneeded columns and avoid decompressing them.
I took a look at Hive's operators and SerDe. It seems that they all take a
whole row object as input and do not know which columns are needed before
processing.
For example, LazyStruct and StructObjectInspector only learn which column is
needed when getField/getStructFieldData is invoked by the operators'
evaluators (such as ExprNodeColumnEvaluator).
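
To make the problem concrete, here is a minimal sketch of the access pattern
described above. It does not use Hive's actual classes: ColumnReader, LazyRow,
and their methods are hypothetical names invented for illustration, and the
"decompression" is only simulated. The point is that the reader learns which
column is needed at the moment getField is called, not before.

```java
import java.util.Set;
import java.util.TreeSet;

class LazyRowSketch {

    /** Hypothetical column-oriented reader; NOT a real Hive or RCFile class. */
    static class ColumnReader {
        private final String[][] stored;  // stored[c][r]: value of column c in row r
        private final String[][] cache;   // columns already "decompressed"
        final Set<Integer> decompressed = new TreeSet<>();  // ids of columns touched so far

        ColumnReader(String[][] stored) {
            this.stored = stored;
            this.cache = new String[stored.length][];
        }

        /** Returns column col, "decompressing" it on first access only. */
        String[] column(int col) {
            if (cache[col] == null) {
                cache[col] = stored[col].clone();  // stands in for real decompression
                decompressed.add(col);
            }
            return cache[col];
        }
    }

    /** Hypothetical lazy row: it is handed to operators as a whole row object. */
    static class LazyRow {
        private final ColumnReader reader;
        private final int rowIndex;

        LazyRow(ColumnReader reader, int rowIndex) {
            this.reader = reader;
            this.rowIndex = rowIndex;
        }

        /** Only here does the reader find out that column col is needed. */
        String getField(int col) {
            return reader.column(col)[rowIndex];
        }
    }

    public static void main(String[] args) {
        ColumnReader reader = new ColumnReader(new String[][] {
            {"a0", "a1"}, {"b0", "b1"}, {"c0", "c1"}
        });
        LazyRow row = new LazyRow(reader, 0);

        // Before any field access, nothing is known about the needed columns.
        System.out.println(reader.decompressed);  // prints []

        // An evaluator asks for column 1; only now is that column touched.
        System.out.println(row.getField(1));      // prints b0
        System.out.println(reader.decompressed);  // prints [1]
    }
}
```

In this sketch the reader can avoid work lazily, but it cannot plan its reads
up front, because the set of needed columns only becomes visible one getField
call at a time.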
> Make Hive support column based storage
> --------------------------------------
>
> Key: HIVE-352
> URL: https://issues.apache.org/jira/browse/HIVE-352
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: He Yongqiang
>
> Column-based storage has proven to be a better storage layout for OLAP.
> Hive does a great job on raw row-oriented storage. In this issue, we will
> enhance Hive to support column-based storage.
> Actually, we have done some work on column-based storage on top of HDFS; I
> think it will need some review and refactoring to port it to Hive.
> Any thoughts?