[ https://issues.apache.org/jira/browse/HIVE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong resolved HIVE-3593.
---------------------------------
Resolution: Not A Problem
Actually, the regex to get the task ID already avoids this problem.
> Output files of SMB join grow indefinitely
> ------------------------------------------
>
> Key: HIVE-3593
> URL: https://issues.apache.org/jira/browse/HIVE-3593
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
>
> The output files of an SMB join are prefixed by the partition spec of the big
> table that was used to create them. The length of the bucket number portion
> of the file name is adjusted to match the length of the task ID.
> Since the task ID is the name of the file, this means that if the output of
> one SMB join is used as the big table of another SMB join, the output file
> names will grow by the length of the original partition spec. Compound this
> and the file name length can grow indefinitely.
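
To illustrate the description above and why the resolution calls it not a problem, here is a minimal Python sketch. It is not Hive code: the file name layout `(partition spec)task ID`, the pattern `TASK_ID_RE`, and all function names are assumptions made up for this example, and the bucket-number padding step is left out. It compares treating the whole file name as the task ID with extracting only the trailing digits by regex.

```python
import re

# Hypothetical pattern (not copied from Hive): capture the trailing run of
# digits as the task ID and ignore an optional "_<attempt>" suffix.
TASK_ID_RE = re.compile(r"(\d+)(?:_\d+)?$")


def naive_task_id(file_name: str) -> str:
    """Feared behaviour: the whole file name is used as the task ID."""
    return file_name


def regex_task_id(file_name: str) -> str:
    """Resolved behaviour: only the trailing digits are used as the task ID."""
    match = TASK_ID_RE.search(file_name)
    return match.group(1) if match else file_name


def output_name(partition_spec: str, task_id: str) -> str:
    """Assumed naming scheme: prefix the task ID with the partition spec."""
    return f"({partition_spec}){task_id}"


def chained_name_lengths(rounds, extract, start="000001_0",
                         partition_spec="ds=2012-10-17"):
    """Feed each join's output file name into the next join as its input."""
    name = start
    lengths = []
    for _ in range(rounds):
        name = output_name(partition_spec, extract(name))
        lengths.append(len(name))
    return lengths


if __name__ == "__main__":
    print("naive:", chained_name_lengths(3, naive_task_id))
    print("regex:", chained_name_lengths(3, regex_task_id))
```

Under these assumptions, the naive variant adds one partition-spec prefix per chained join, while the regex variant discards the earlier prefix each time, so the name length stays constant.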
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira