[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703616#action_12703616 ]
Zheng Shao commented on HIVE-352:
---------------------------------
Good news: a test on some of our internal data shows that the column-based
storage saves 20%+ space for us. However, it is about 1.5x slower than
seqfile (SequenceFile) at writing. Not sure why yet. Will do more profiling
tomorrow.
> Make Hive support column based storage
> --------------------------------------
>
> Key: HIVE-352
> URL: https://issues.apache.org/jira/browse/HIVE-352
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: He Yongqiang
> Assignee: He Yongqiang
> Attachments: 4-22 performace2.txt, 4-22 performance.txt, 4-22
> progress.txt, hive-352-2009-4-15.patch, hive-352-2009-4-16.patch,
> hive-352-2009-4-17.patch, hive-352-2009-4-19.patch,
> hive-352-2009-4-22-2.patch, hive-352-2009-4-22.patch,
> hive-352-2009-4-23.patch, hive-352-2009-4-27.patch,
> HIve-352-draft-2009-03-28.patch, Hive-352-draft-2009-03-30.patch
>
>
> Column-based storage has been proven to be a better storage layout for OLAP.
> Hive does a great job on raw row-oriented storage. In this issue, we will
> enhance Hive to support column-based storage.
> Actually, we have done some work on column-based storage on top of HDFS; I
> think it will need some review and refactoring to port it to Hive.
> Any thoughts?