Thiruvel Thirumoolan created HIVE-8371:
------------------------------------------
Summary: HCatStorer should fail by default when publishing to an
existing partition
Key: HIVE-8371
URL: https://issues.apache.org/jira/browse/HIVE-8371
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 0.13.1, 0.13.0, 0.14.0
Reporter: Thiruvel Thirumoolan
In Hive 0.12 and before (or in previous HCatalog releases), HCatStorer would fail
if the partition already existed (either before launching the job or during
commit, depending on the partitioning). HIVE-6406 changed that behavior so that
the default is now an append. This causes data quality issues, since a rerun (or
duplicate run) no longer fails (as it used to) and instead silently appends to
the partition.
A preferable approach would be to keep HCatStorer's original behavior (fail on
a duplicate publish) and support append through an explicit option. Overwrite
could be implemented in a similar fashion. E.g.:
store A into 'db.table' using
org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');
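
The proposed semantics can be sketched as a small model. This is only an illustration of the fail-by-default idea, not HCatStorer code: the names PublishMode, PartitionExistsError and publish are hypothetical, and a plain dict stands in for the metastore and storage.

```python
from enum import Enum


class PublishMode(Enum):
    FAIL = "fail"            # proposed default: reject a duplicate publish
    APPEND = "append"        # opt-in: add rows to the existing partition
    OVERWRITE = "overwrite"  # opt-in: replace the partition's contents


class PartitionExistsError(Exception):
    """Raised when publishing to an existing partition without opting in."""


def publish(table, partition, rows, mode=PublishMode.FAIL):
    """Publish rows into table[partition] under the given mode.

    `table` is a dict mapping a partition spec to a list of rows; it is a
    stand-in for the real metastore and is an assumption of this sketch.
    """
    if partition not in table:
        table[partition] = list(rows)
    elif mode is PublishMode.FAIL:
        raise PartitionExistsError(f"partition {partition!r} already exists")
    elif mode is PublishMode.APPEND:
        table[partition].extend(rows)
    else:  # PublishMode.OVERWRITE
        table[partition] = list(rows)
    return table[partition]
```

With this model, a rerun of the same job raises PartitionExistsError unless the caller explicitly passes PublishMode.APPEND or PublishMode.OVERWRITE, which is the behavior requested above.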