[ 
https://issues.apache.org/jira/browse/HIVE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495723#comment-14495723
 ] 

Dong Chen commented on HIVE-10016:
----------------------------------

Thanks for working on the branch! [~spena]

I am uploading the patch, but I have hit a problem. When I rebase the latest patch 
'HIVE-10016.patch' (which targets trunk) onto the 'parquet' branch, a merge conflict 
occurs. This is because the branch is about one month behind trunk.

Do you think we should sync the branch first, and then update the patch? (If so, I 
will rebase the latest patch after the branch is synced.)

Or should we merge all the patches first, then sync with trunk and resolve the 
conflicts together? (If so, patch 'HIVE-10016.1-parquet.patch' is OK to commit now.)

> Remove duplicated Hive table schema parsing in DataWritableReadSupport
> ----------------------------------------------------------------------
>
>                 Key: HIVE-10016
>                 URL: https://issues.apache.org/jira/browse/HIVE-10016
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Dong Chen
>            Assignee: Dong Chen
>         Attachments: HIVE-10016-parquet.patch, HIVE-10016.1-parquet.patch, 
> HIVE-10016.patch
>
>
> In {{DataWritableReadSupport.init()}}, the table schema is created and its 
> string form is set in conf. When the {{ParquetRecordReaderWrapper}} is 
> constructed, the schema is fetched from conf and parsed several times.
> We could remove this redundant schema parsing and speed up 
> getRecordReader a bit.
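
To illustrate the idea in the description above, here is a minimal sketch of "parse once, pass the object" versus "re-parse the string from conf each time". It is not the Hive or Parquet code: the `Schema` class, the `parse` helper, the `table.schema` key, and the comma-separated schema format are all hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemaParseOnce {
    // Hypothetical stand-in for a parsed table schema (not Parquet's real type).
    static final class Schema {
        final List<String> columns;
        Schema(List<String> columns) { this.columns = columns; }
    }

    // Counts how often the string form is parsed, to make the saving visible.
    static int parseCount = 0;

    // Hypothetical parser: splits a comma-separated column list.
    static Schema parse(String text) {
        parseCount++;
        List<String> cols = new ArrayList<>();
        for (String c : text.split(",")) {
            cols.add(c.trim());
        }
        return new Schema(cols);
    }

    // Before: each consumer fetches the string from conf and parses it again.
    static int columnCountReparsing(Map<String, String> conf) {
        Schema s = parse(conf.get("table.schema")); // key name is an assumption
        return s.columns.size();
    }

    // After: consumers receive the already-parsed schema object.
    static int columnCount(Schema s) {
        return s.columns.size();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("table.schema", "id, name, price");

        parseCount = 0;
        for (int i = 0; i < 3; i++) {
            columnCountReparsing(conf);
        }
        System.out.println("re-parsing: " + parseCount + " parses");

        parseCount = 0;
        Schema schema = parse(conf.get("table.schema"));
        for (int i = 0; i < 3; i++) {
            columnCount(schema);
        }
        System.out.println("parse once: " + parseCount + " parse");
    }
}
```

In this sketch the three consumers trigger three parses in the first variant and a single parse in the second; how large the real saving is in getRecordReader would depend on the actual schema size and call pattern.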



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
