[
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778612#comment-16778612
]
Vineet Garg commented on HIVE-21279:
------------------------------------
[~ashutoshc] For some reason I wasn't able to create a Review Board request, so I
have created a pull request instead at https://github.com/apache/hive/pull/552.
> Avoid moving/rename operation in FileSink op for SELECT queries
> ---------------------------------------------------------------
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.2.patch,
> HIVE-21279.3.patch, HIVE-21279.4.patch, HIVE-21279.5.patch,
> HIVE-21279.6.patch, HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp
> directory to another directory, from which FetchTask fetches the result. This
> is done to avoid fetching potentially partial/invalid files left by
> failed/runaway tasks. This operation is expensive on cloud storage. It could
> be avoided if FetchTask were passed the set of files to read instead of the
> whole directory.
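
The idea in the description can be illustrated with a minimal sketch. This is not the code from the patch or from Hive itself: the class `FetchFromManifest`, the methods `writeTaskOutput` and `fetch`, and the file names are all hypothetical, and plain `java.nio.file` on a local directory stands in for the real FileSink/FetchTask and the Hadoop file system. It only shows that a reader given an explicit list of files never needs the directory to be renamed, because stray files from failed attempts are simply never opened.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class FetchFromManifest {

    // Hypothetical stand-in for a task attempt writing one output file
    // into the temp directory. Returns the path that was written.
    public static Path writeTaskOutput(Path tmpDir, String fileName, String rows)
            throws IOException {
        Path file = tmpDir.resolve(fileName);
        Files.write(file, rows.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    // Hypothetical stand-in for the fetch side. It reads only the files it
    // was handed and never lists the directory, so no rename is needed to
    // hide leftovers from failed or runaway attempts.
    public static List<String> fetch(List<Path> manifest) throws IOException {
        List<String> rows = new ArrayList<>();
        for (Path file : manifest) {
            rows.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Path tmpDir = Files.createTempDirectory("filesink-tmp");

        // Only files from successful attempts are recorded in the manifest.
        List<Path> manifest = new ArrayList<>();
        manifest.add(writeTaskOutput(tmpDir, "000000_0", "a\nb\n"));
        manifest.add(writeTaskOutput(tmpDir, "000001_0", "c\n"));

        // Output of a failed attempt: present in the directory, absent from the manifest.
        writeTaskOutput(tmpDir, "000001_1", "c-partial\n");

        System.out.println(fetch(manifest));
    }
}
```

With a directory-based handoff, the leftover `000001_1` file would have to be kept out of the final directory before fetching; with a file list it is ignored without any move.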