[
https://issues.apache.org/jira/browse/HIVE-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107737#comment-16107737
]
Sergey Shelukhin commented on HIVE-17214:
-----------------------------------------
There's a setting in Hive (and Tez/MR?) that makes it enumerate input
directory contents recursively, so in theory the paths under a table could be
completely arbitrary. Union on Tez is just a special case that makes use of
this feature; I'm not sure anyone actually relies on it otherwise. Flattening
the layout might be an option when converting to ACID. Alternatively, throw an
error by default and flatten only if some parameter is passed to the alter
query or some config setting is set.
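
To make the "error by default, flatten on request" idea concrete, here is a
minimal sketch. It is not Hive code: the local filesystem stands in for HDFS,
and the names nested_files, prepare_for_conversion, NestedLayoutError and the
flatten flag are all hypothetical. The renaming scheme is only an assumption
for illustration and ignores whatever file-naming rules ACID conversion
actually requires.

```python
import os
import shutil


class NestedLayoutError(Exception):
    """Raised when files sit below the top level of a partition directory."""


def nested_files(partition_dir):
    """Return all files that are not direct children of partition_dir."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(partition_dir):
        if dirpath == partition_dir:
            continue  # files directly in the partition directory are fine
        found.extend(os.path.join(dirpath, name) for name in filenames)
    return sorted(found)


def prepare_for_conversion(partition_dir, flatten=False):
    """Reject a nested layout, or flatten it when flatten=True.

    Returns the list of new paths for the files that were moved.
    """
    nested = nested_files(partition_dir)
    if not nested:
        return []
    if not flatten:
        raise NestedLayoutError(
            "%d file(s) below the top level of %s" % (len(nested), partition_dir)
        )
    moved = []
    for src in nested:
        rel = os.path.relpath(src, partition_dir)
        # Assumption: fold the subdirectory into the file name so that files
        # from different subdirectories cannot collide.
        dst = os.path.join(partition_dir, rel.replace(os.sep, "_"))
        if os.path.exists(dst):
            # Files moved before this point stay moved; no rollback here.
            raise FileExistsError(dst)
        shutil.move(src, dst)
        moved.append(dst)
    # Remove the now-empty subdirectories, deepest first.
    for dirpath, _dirnames, _filenames in os.walk(partition_dir, topdown=False):
        if dirpath != partition_dir and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return moved
```

With one subdirectory per union leg (say 1/000000_0 and 2/000000_0), the
default call raises NestedLayoutError, while flatten=True moves the files up
to 1_000000_0 and 2_000000_0 and removes the empty leg directories.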
> check/fix conversion of non-acid to acid
> ----------------------------------------
>
> Key: HIVE-17214
> URL: https://issues.apache.org/jira/browse/HIVE-17214
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
>
> Bucketed tables have stricter rules for file layout on disk: bucket files
> are direct children of a partition directory.
> For un-bucketed tables I'm not sure there are any rules.
> For example, CTAS with Tez + Union operator creates 1 directory for each leg
> of the union.
> Supposedly Hive can read a table by picking up all files recursively.
> Can it also write (other than the CTAS example above) arbitrarily?
> Does that mean an Acid write can also write anywhere?
> Figure out what can be supported and how the existing layout can be checked.
> Examining a full "ls -l -R" for a large table could be expensive.
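
On the cost of a full recursive listing: a layout check does not have to
enumerate everything. The sketch below is an illustration only, with the local
filesystem standing in for HDFS and has_nested_files as a hypothetical name.
It skips the files at the top level of a partition, descends only into
subdirectories, and stops at the first file it finds there.

```python
import os


def has_nested_files(partition_dir):
    """Return True as soon as one file is found below the top level."""
    # Only subdirectories of the partition need a closer look; files that
    # are direct children already match the flat layout.
    with os.scandir(partition_dir) as entries:
        pending = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                else:
                    return True  # first nested file found, stop listing
    return False
```

For a flat partition this costs a single directory listing. Empty
subdirectories are not reported as a violation, which is a design choice of
this sketch rather than a known Hive rule.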