[
https://issues.apache.org/jira/browse/HIVE-18391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Deepak Jaiswal updated HIVE-18391:
----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
With the result files handled in separate patches, this JIRA is no longer needed.
The work will be tracked in the parent issue, HIVE-18350.
> load data should rename files consistent with insert statements (bucketed
> tables only)
> --------------------------------------------------------------------------------------
>
> Key: HIVE-18391
> URL: https://issues.apache.org/jira/browse/HIVE-18391
> Project: Hive
> Issue Type: Sub-task
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-18391.1.patch, HIVE-18391.2.patch,
> HIVE-18391.3.patch
>
>
> Insert statements create files whose names end with 0000_0, 0001_0, etc.
> However, load data keeps the input file name. This results in an inconsistent
> naming convention, which makes SMB joins difficult in some scenarios and may
> cause trouble for other types of queries in the future.
> We need a consistent naming convention.
> For bucketed tables, Hive relies on the user to name the files to match the
> bucket in non-strict mode. Hive assumes that all the data in a file belongs to
> the same bucket. In strict mode, loading into a bucketed table is disabled.
> This will likely affect most of the tests that load data, so the impact is
> significant.
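
To illustrate the naming convention described in the issue, here is a minimal sketch of how loaded files could be given insert-style names. It is not the code from the attached patches. The helper names `bucket_file_name` and `plan_renames` are hypothetical, the six-digit zero padding and the `_0` suffix are assumptions based on the "0000_0, 0001_0" endings mentioned above, and assigning bucket ids by sorted file name is only one possible policy.

```python
from typing import Dict, List


def bucket_file_name(bucket_id: int, attempt: int = 0, width: int = 6) -> str:
    """Build an insert-style file name such as 000001_0.

    The padding width and the attempt suffix are assumptions for this
    sketch, not values taken from the Hive source.
    """
    if bucket_id < 0:
        raise ValueError("bucket_id must be non-negative")
    return f"{bucket_id:0{width}d}_{attempt}"


def plan_renames(input_files: List[str]) -> Dict[str, str]:
    """Map each input file name to an insert-style name.

    Hypothetical policy: sort the input names and treat the i-th file
    as bucket i. This mirrors the non-strict assumption that each file
    holds data for exactly one bucket.
    """
    return {
        name: bucket_file_name(i)
        for i, name in enumerate(sorted(input_files))
    }
```

For example, `plan_renames(["part-b.txt", "part-a.txt"])` maps `part-a.txt` to `000000_0` and `part-b.txt` to `000001_0`, so the loaded files would follow the same pattern as files written by an insert.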