[
https://issues.apache.org/jira/browse/HIVE-16722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204067#comment-16204067
]
Eugene Koifman commented on HIVE-16722:
---------------------------------------
Patch 1 validates the file names when transactional=true is set on an
existing table.
The work in HIVE-17204 ensures that non-acid to acid conversion can handle
original files (_OrcRawRecordMerger_), which can be in subdirectories of the
table/partition.
> Converting bucketed non-acid table to acid should perform validation
> --------------------------------------------------------------------
>
> Key: HIVE-16722
> URL: https://issues.apache.org/jira/browse/HIVE-16722
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-16722.01.patch, HIVE-16722.WIP.patch
>
>
> Converting a non-acid table to acid performs only metadata validation (in
> _TransactionalValidationListener_).
> The data read code path only understands certain directory layouts and file
> names and (generally) ignores files that don't match the expected format.
> In Hive, directory layout and bucket file naming are poorly enforced
> (especially in older releases).
> Need to add a validation step on
> {noformat}
> alter table T SET TBLPROPERTIES ('transactional'='true')
> {noformat}
> to scan the file system and report any possible data loss scenarios.
> Currently Acid understands bucket file names like "00000_0" and (with
> HIVE-16177) "00000_0_copy1", etc., at the root of the partition.
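
To make the proposed check concrete, here is a minimal sketch of a file-name scan, written in Python for brevity. It is not the code in the attached patch. The regular expression is an assumption inferred only from the two example names in the description, and the function name and the allow_subdirs flag are hypothetical. The real rules enforced by Hive may differ.

```python
import re

# Assumed pattern, inferred only from the examples in this issue:
# "<digits>_<digits>", optionally followed by "_copy<N>" or "_copy_<N>".
BUCKET_FILE = re.compile(r"^\d+_\d+(_copy_?\d+)?$")


def find_suspect_files(relative_paths, allow_subdirs=False):
    """Return the paths a reader expecting the layout above might ignore.

    Paths are relative to the table/partition root. A path is flagged when
    its file name does not match BUCKET_FILE, or when it sits in a
    subdirectory and allow_subdirs is False.
    """
    suspects = []
    for path in relative_paths:
        parts = path.strip("/").split("/")
        in_subdir = len(parts) > 1
        bad_name = BUCKET_FILE.match(parts[-1]) is None
        if bad_name or (in_subdir and not allow_subdirs):
            suspects.append(path)
    return suspects


files = ["00000_0", "00000_0_copy1", "part-00000", "sub/00000_0"]
print(find_suspect_files(files))
```

With the default allow_subdirs=False the scan follows the description as written (bucket files at the root of the partition). Passing allow_subdirs=True approximates the case where files in subdirectories are acceptable, so only badly named files are reported.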