[ 
https://issues.apache.org/jira/browse/HIVE-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669500#action_12669500
 ] 

Zheng Shao commented on HIVE-34:
--------------------------------

A recent performance study by Rodrigo showed that creating a new String object 
for each column in each row adds significant performance overhead.
We might want to use lazy initialization to avoid the cost of creating those 
String objects (or use a modified Text class).
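
A rough sketch of the lazy-initialization idea, in case it helps the 
discussion. This is not existing Hive code: the class name LazyStringField 
and its methods are hypothetical, and it assumes the field bytes are UTF-8 
and that the row buffer is not reused while the field is still referenced. 
The field only remembers where its bytes are in the row buffer, and the 
String is created on first access.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of DynamicSerDe or Hadoop's Text.
final class LazyStringField {
    private final byte[] buf;   // shared row buffer, referenced but not copied
    private final int offset;   // start of this field within buf
    private final int length;   // number of bytes in this field
    private String cached;      // stays null until the field is first read

    LazyStringField(byte[] buf, int offset, int length) {
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    // True once get() has been called and a String has been allocated.
    boolean isDecoded() {
        return cached != null;
    }

    // Decodes the bytes on first access and reuses the result afterwards.
    String get() {
        if (cached == null) {
            cached = new String(buf, offset, length, StandardCharsets.UTF_8);
        }
        return cached;
    }
}
```

With something like this, a column that the query never reads never 
allocates a String at all, and a column that is read several times is only 
decoded once.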

> Make DynamicSerDe capable of skipping fields that will not be used in the 
> query
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-34
>                 URL: https://issues.apache.org/jira/browse/HIVE-34
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Pete Wyckoff
>
> Thrift/DynamicSerDe always deserializes every field in the record and 
> converts it to the correct type. Often, only a few of the fields are 
> actually used.
> e.g., select foo.user from foo where foo.created < 'today'
> where foo is something like
> struct {
>   string user
>   i64 created
>   string fullname
>   string description
>   i32 something
>   i32 somethingelse
>   ...
> }
> Parsing fullname, description, something and somethingelse is wasted work 
> in this case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
