[jira] Commented: (HIVE-352) Make Hive support column based storage

Zheng Shao (JIRA) Thu, 30 Apr 2009 12:54:55 -0700

    [ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704765#action_12704765 ]


Zheng Shao commented on HIVE-352:
---------------------------------

>>>>Writer: how do you pass the column number from Hive to the configuration 
>>>>and then to the RCFile.Writer?
>>The code is in RCFileOutputFormat's getHiveRecordWriter(). It tries to parse 
>>the columns from the passed-in Properties.
Thanks. I understand it now.

>>>>init(...): Clearing out the object and recreating the LazyObject is not 
>>>>efficient.
>>If we change it, it will not pass the TestRCFile test. The final extra else 
>>if statements are rarely reached, and when they are reached, most of the time 
>>only one instruction is needed to determine whether fields[fieldIndex] is null.

Can you add a boolean[] fieldIsNull to mark whether a field is null, instead of 
throwing away and recreating the LazyObject?
Then getField can check fieldIsNull to decide whether to return null or the 
LazyObject.
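
A minimal sketch of this suggestion, under stated assumptions: LazyFieldSketch 
and LazyRowSketch are hypothetical stand-ins for the LazyObject and its 
containing struct, not classes from the patch, and a null byte[] stands in for 
however the real reader detects a null column. It only illustrates keeping the 
field objects alive and flipping a flag instead of recreating them.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a LazyObject: it can be re-pointed at new bytes
// without being reallocated.
class LazyFieldSketch {
    private byte[] data;
    private int start;
    private int length;

    // Reuse this instance for a new byte range.
    void init(byte[] data, int start, int length) {
        this.data = data;
        this.start = start;
        this.length = length;
    }

    String asString() {
        return new String(data, start, length, StandardCharsets.UTF_8);
    }
}

// Hypothetical stand-in for the row/struct object that owns the fields.
class LazyRowSketch {
    private final LazyFieldSketch[] fields;
    private final boolean[] fieldIsNull;

    LazyRowSketch(int numFields) {
        fields = new LazyFieldSketch[numFields];
        fieldIsNull = new boolean[numFields];
        for (int i = 0; i < numFields; i++) {
            fields[i] = new LazyFieldSketch(); // allocated once, reused per row
            fieldIsNull[i] = true;             // nothing loaded yet
        }
    }

    // Assumption: columns[i] == null means "column i is null in this row".
    void init(byte[][] columns) {
        for (int i = 0; i < fields.length; i++) {
            byte[] col = i < columns.length ? columns[i] : null;
            if (col == null) {
                fieldIsNull[i] = true;         // mark only; keep the object
            } else {
                fieldIsNull[i] = false;
                fields[i].init(col, 0, col.length);
            }
        }
    }

    // Return null for null fields, otherwise the reused field object.
    LazyFieldSketch getField(int i) {
        return fieldIsNull[i] ? null : fields[i];
    }
}
```

With this layout, init(...) costs one boolean write per null field, and 
getField does a single array lookup before deciding what to return.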


> Make Hive support column based storage
> --------------------------------------
>
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: 4-22 performace2.txt, 4-22 performance.txt, 4-22 
> progress.txt, hive-352-2009-4-15.patch, hive-352-2009-4-16.patch, 
> hive-352-2009-4-17.patch, hive-352-2009-4-19.patch, 
> hive-352-2009-4-22-2.patch, hive-352-2009-4-22.patch, 
> hive-352-2009-4-23.patch, hive-352-2009-4-27.patch, 
> hive-352-2009-4-30-2.patch, hive-352-2009-4-30-3.patch, 
> hive-352-2009-4-30-4.patch, hive-352-2009-5-1.patch, 
> HIve-352-draft-2009-03-28.patch, Hive-352-draft-2009-03-30.patch
>
>
> Column-based storage has proven to be a better storage layout for OLAP. 
> Hive does a great job on raw row-oriented storage. In this issue, we will 
> enhance Hive to support column-based storage. 
> Actually, we have done some work on column-based storage on top of HDFS; I 
> think it will need some review and refactoring to port it to Hive.
> Any thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
