[ 
https://issues.apache.org/jira/browse/HIVE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744029#action_12744029
 ] 

He Yongqiang commented on HIVE-756:
-----------------------------------

The changes to ColumnarStruct look good.
@hive-756.patch, line 100:
The new variable "prjColIDs" is not necessary. A boolean array can do the same 
job, and the cost would not differ much. 
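
To illustrate the point, here is a rough sketch of driving the projection from a boolean array alone. The class and method names are hypothetical and are not taken from the patch; the only assumption is that each column index has one flag saying whether it is projected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not code from hive-756.patch.
public class ProjectionSketch {

    // Collect the values of the projected columns by scanning the mask.
    // One pass over all columns replaces a separate list of projected IDs.
    static List<String> readProjected(String[] row, boolean[] projected) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < row.length && i < projected.length; i++) {
            if (projected[i]) {
                out.add(row[i]);
            }
        }
        return out;
    }
}
```

The scan touches every column index rather than only the projected ones, but the extra work per skipped column is a single flag check.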
{quote}
1. Columns information is provided but empty: we ignore all columns
2. Columns information is not provided: we read all columns.
In this way if the caller (some non-hive applications) does not know the RCFile 
column information settings, it can still read in all columns.
{quote}
Agreed. We can use "none" as the conf value to denote empty columns (read 
nothing), and use "" to denote all columns. The code for setting and reading 
this value lies in HiveFileFormatUtils. 
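
A minimal sketch of that convention follows. The class and method names are hypothetical, and the comma-separated encoding of column IDs is an assumption for illustration; this is not the actual HiveFileFormatUtils code. A null list stands for "not provided", so both an unset value and "" decode to "read all columns".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "none" / "" convention, not Hive code.
public class ColumnConfSketch {

    static final String NONE = "none";

    // null list -> "" (all columns); empty list -> "none"; else "0,2,5".
    static String encode(List<Integer> ids) {
        if (ids == null) {
            return "";
        }
        if (ids.isEmpty()) {
            return NONE;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(ids.get(i));
        }
        return sb.toString();
    }

    // Unset or "" -> null (read all); "none" -> empty list (read nothing).
    static List<Integer> decode(String value) {
        if (value == null || value.isEmpty()) {
            return null;
        }
        List<Integer> ids = new ArrayList<>();
        if (NONE.equals(value)) {
            return ids;
        }
        for (String part : value.split(",")) {
            ids.add(Integer.parseInt(part.trim()));
        }
        return ids;
    }
}
```

With this shape, a non-Hive caller that never sets the value still reads every column, which matches case 2 in the quoted text.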

> performance improvement for RCFile and ColumnarSerDe in Hive
> ------------------------------------------------------------
>
>                 Key: HIVE-756
>                 URL: https://issues.apache.org/jira/browse/HIVE-756
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: hive-756.patch
>
>
> There are some easy performance improvements in the columnar storage in Hive 
> I found during Hackathon. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
