[ https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617713#comment-14617713 ]
Ferdinand Xu commented on HIVE-11131:
-------------------------------------
LGTM +1
> Get row information on DataWritableWriter once for better writing performance
> -----------------------------------------------------------------------------
>
> Key: HIVE-11131
> URL: https://issues.apache.org/jira/browse/HIVE-11131
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 1.2.0
> Reporter: Sergio Peña
> Assignee: Sergio Peña
> Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch,
> HIVE-11131.4.patch
>
>
> DataWritableWriter is the class that writes Hive records to Parquet files.
> Every time a record is written (that is, every time write() is called), it
> looks up all the information it needs to parse the record again, such as the
> schema and the object inspectors.
> We can make this class perform better by initializing a writer per data type
> once and saving the corresponding object inspector on each writer.
> The class can assume that the records written next will have the same object
> inspectors and schema, so there is no need to check for changes on every
> write. When a different schema has to be written, Parquet creates a new
> DataWritableWriter.
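
The idea in the description can be illustrated with a minimal, self-contained
sketch. This is not the code from the attached patches and it does not use the
Hive or Parquet APIs: FieldWriter, CachedRecordWriter, setupCount and the
string output format are all hypothetical names chosen for illustration. The
sketch only shows the pattern of resolving per-field type information once, in
the constructor, and reusing the resulting writers for every record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-field writer; not a Hive or Parquet type.
interface FieldWriter {
    void write(Object value, StringBuilder out);
}

// Hypothetical record writer that builds its field writers exactly once.
final class CachedRecordWriter {
    private final List<FieldWriter> writers = new ArrayList<>();

    // Counts how many times type information was resolved (for illustration).
    int setupCount = 0;

    CachedRecordWriter(List<String> fieldNames, List<Class<?>> fieldTypes) {
        if (fieldNames.size() != fieldTypes.size()) {
            throw new IllegalArgumentException("names and types must have the same length");
        }
        // The per-type decision is made here, once, not on every write().
        for (int i = 0; i < fieldNames.size(); i++) {
            writers.add(createWriter(fieldNames.get(i), fieldTypes.get(i)));
        }
    }

    private FieldWriter createWriter(String name, Class<?> type) {
        setupCount++;
        if (type == Integer.class) {
            return (value, out) -> out.append(name).append("=int:").append(value).append(';');
        }
        if (type == String.class) {
            return (value, out) -> out.append(name).append("=str:").append(value).append(';');
        }
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    // Assumes every record has the same layout as the one given at construction,
    // so no type checks or lookups are repeated here.
    String write(List<Object> record) {
        if (record.size() != writers.size()) {
            throw new IllegalArgumentException("record does not match the schema");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < writers.size(); i++) {
            writers.get(i).write(record.get(i), out);
        }
        return out.toString();
    }
}
```

In this sketch a caller that needs a different schema would construct a new
CachedRecordWriter rather than reconfigure the existing one, which mirrors the
statement above that the writer is created again when the schema changes.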
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)