[
https://issues.apache.org/jira/browse/HIVE-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015706#comment-13015706
]
He Yongqiang commented on HIVE-2089:
------------------------------------
Actually, I just found that recent Hadoop's CombineFileInputFormat supports
non-splittable files as input. So this won't be a problem for .gz files if the
Hadoop version in use has that feature checked in.
Another use case for the new input format is Hive's SymlinkInputFormat, which
may point to a very large number of .gz files.
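
To illustrate what combining non-splittable files means, here is a minimal
sketch. It is not the attached patch and not Hadoop's actual code. The class
WholeFileCombiner, its combine method, and the sample sizes are all
hypothetical. The idea is that each .gz file stays whole and files are only
grouped together, so one mapper can read several small files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: packs whole (non-splittable) files into
// combined splits. A file is never cut; it is only grouped with neighbours.
public class WholeFileCombiner {

    // Returns groups of file indexes. A group is closed as soon as adding the
    // next file would push its total size past maxSplitBytes. A single file
    // larger than maxSplitBytes still gets a group of its own, because a
    // .gz file cannot be split.
    public static List<List<Integer>> combine(long[] fileSizes, long maxSplitBytes) {
        List<List<Integer>> splits = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBytes = 0;
        for (int i = 0; i < fileSizes.length; i++) {
            if (!current.isEmpty() && currentBytes + fileSizes[i] > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(i);
            currentBytes += fileSizes[i];
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed sample sizes, in arbitrary units.
        long[] sizes = {40, 30, 50, 200, 10};
        System.out.println(combine(sizes, 100)); // prints [[0, 1], [2], [3], [4]]
    }
}
```

With tens of thousands of small .gz files, grouping like this is what brings
the number of map tasks down from one per file to one per combined split.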
> Add a new input format to be able to combine multiple .gz text files
> --------------------------------------------------------------------
>
> Key: HIVE-2089
> URL: https://issues.apache.org/jira/browse/HIVE-2089
> Project: Hive
> Issue Type: New Feature
> Reporter: He Yongqiang
> Assignee: He Yongqiang
> Attachments: HIVE-2089.1.patch
>
>
> For files that are not splittable, CombineHiveInputFormat won't help. This
> JIRA is to add a new input format that can combine such files. This is very
> useful for partitions with tens of thousands of .gz files.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira