[jira] [Updated] (HIVE-6476) Support Append with Dynamic Partitioning
[ https://issues.apache.org/jira/browse/HIVE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mariappan Asokan updated HIVE-6476:
---
    Affects Version/s: 2.2.0
     Target Version/s: 2.2.0
               Status: Patch Available  (was: Open)

> Support Append with Dynamic Partitioning
>
>                 Key: HIVE-6476
>                 URL: https://issues.apache.org/jira/browse/HIVE-6476
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HCatalog, Metastore, Query Processor, Thrift API
>    Affects Versions: 2.2.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Mariappan Asokan
>         Attachments: HIVE-6476.1.patch
>
> Currently, we do not support mixing dynamic partitioning and append in the
> same job. One reason is that we need exhaustive testing of corner cases for
> that, and a second reason is the behaviour of add_partitions. To support
> dynamic partitioning with append, we'd have to have an
> add_partitions_if_not_exist call, rather than an add_partitions call.
> Thus, the current implementation in HIVE-6475 assumes immutability for all
> dynamic partitioning jobs, irrespective of whether the table is marked
> as mutable.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
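Editorial sketch of the distinction the description draws between add_partitions and add_partitions_if_not_exist. This is a toy in-memory model, not the Hive metastore API: the class PartitionCatalogSketch and its methods are hypothetical names, and the all-or-nothing failure of addPartitions is an assumption made for illustration. It shows why an append job that touches both existing and new partitions needs the second form.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory stand-in for a partition catalog; not Hive code. */
final class PartitionCatalogSketch {
    private final Set<String> partitions = new LinkedHashSet<>();

    /** Assumed all-or-nothing behaviour: fails if any requested partition already exists. */
    void addPartitions(List<String> names) {
        for (String name : names) {
            if (partitions.contains(name)) {
                throw new IllegalStateException("Partition already exists: " + name);
            }
        }
        partitions.addAll(names);
    }

    /**
     * Adds only the partitions that are missing and returns the ones it created,
     * so a failed commit can later drop exactly those and leave older ones alone.
     */
    List<String> addPartitionsIfNotExist(List<String> names) {
        List<String> created = new ArrayList<>();
        for (String name : names) {
            if (partitions.add(name)) {
                created.add(name);
            }
        }
        return created;
    }

    boolean contains(String name) {
        return partitions.contains(name);
    }
}
```

With "ds=1" already registered, addPartitions on ("ds=1", "ds=2") fails and registers nothing, while addPartitionsIfNotExist on the same list registers and reports only "ds=2".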
[jira] [Updated] (HIVE-6476) Support Append with Dynamic Partitioning
[ https://issues.apache.org/jira/browse/HIVE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mariappan Asokan updated HIVE-6476:
---
    Attachment: HIVE-6476.1.patch
[jira] [Commented] (HIVE-6476) Support Append with Dynamic Partitioning
[ https://issues.apache.org/jira/browse/HIVE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134270#comment-16134270 ]

Mariappan Asokan commented on HIVE-6476:
---

Sushanth,

Here are the changes I made in the uploaded patch:

* Moved some common code to new private methods ({{moveFiles()}} and {{isReservedName()}})
* When dynamic partitioning is used, the table can be mutable
* For dynamic partitioning, when a partition does not exist, the optimization to move the entire directory to the target location is still in effect. However, when a partition exists, the files newly added to the partition are moved to the existing target directory one at a time
* The unique name generation logic is applied only to non-directory files
* When a commit fails, all newly added files are deleted, in addition to the newly created partitions being deleted from the metadata server
* Fixed a minor bug: when there is no new file to move ({{firstChild}} == {{null}} in {{moveTaskOutputs()}}), no action is taken, which avoids a null pointer dereference
* Created tests that cover the different cases: appending new records such that all records go to existing partitions, some records go to existing partitions and others to new partitions, and all records go to new partitions
* Deleted an existing test that tested the failure of dynamic partitioning with append

Please provide your feedback. Thanks.
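Editorial sketch of the file-moving behaviour listed in the comment above, written against java.nio.file on a local filesystem rather than the Hadoop FileSystem API. It is not the code in HIVE-6476.1.patch: the class AppendMoveSketch, its methods, and the "_copy_" suffix are all hypothetical. It illustrates three of the listed points: a whole-directory move for a new partition, per-file moves under unique names for an existing one, and deletion of everything newly added when the commit fails.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch; names and suffix scheme are assumptions, not Hive code. */
final class AppendMoveSketch {

    /** Returns a path in targetDir that does not collide with an existing file. */
    static Path uniqueTarget(Path targetDir, String fileName) {
        Path candidate = targetDir.resolve(fileName);
        int suffix = 1;
        while (Files.exists(candidate)) {
            candidate = targetDir.resolve(fileName + "_copy_" + suffix++);
        }
        return candidate;
    }

    /** Moves job output into a partition directory and returns every path it created. */
    static List<Path> movePartition(Path sourceDir, Path targetDir) throws IOException {
        List<Path> added = new ArrayList<>();
        if (!Files.exists(targetDir)) {
            // New partition: move the whole directory in one step
            // (a plain rename, so this assumes source and target share a file store).
            Path parent = targetDir.getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            Files.move(sourceDir, targetDir);
            added.add(targetDir);
            return added;
        }
        // Existing partition: move regular files one at a time under unique names.
        // An empty source directory simply yields an empty result.
        List<Path> children;
        try (Stream<Path> listing = Files.list(sourceDir)) {
            children = listing.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        for (Path child : children) {
            Path target = uniqueTarget(targetDir, child.getFileName().toString());
            Files.move(child, target);
            added.add(target);
        }
        return added;
    }

    /** Undoes a failed commit by deleting everything movePartition reported as added. */
    static void rollback(List<Path> added) throws IOException {
        for (Path path : added) {
            if (!Files.exists(path)) {
                continue;
            }
            List<Path> toDelete;
            try (Stream<Path> walk = Files.walk(path)) {
                // Reverse order puts children before their parent directories.
                toDelete = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : toDelete) {
                Files.delete(p);
            }
        }
    }
}
```

Returning the list of created paths from movePartition is one way to keep the rollback narrow: files that were already in an existing partition before the job ran are never candidates for deletion.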
[jira] [Commented] (HIVE-17068) HCatalog: Add parquet support
[ https://issues.apache.org/jira/browse/HIVE-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081238#comment-16081238 ]

Mariappan Asokan commented on HIVE-17068:
---

Hi Prasanth,

I am wondering whether HIVE-8838 is related to this Jira. Thanks.

> HCatalog: Add parquet support
>
>                 Key: HIVE-17068
>                 URL: https://issues.apache.org/jira/browse/HIVE-17068
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 3.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-17068.1.patch, HIVE-17068.2.patch
>
> MapredParquetOutputFormat has to support getRecordWriter() for parquet format
> to be used from HCatalog.
[jira] [Commented] (HIVE-6476) Support Append with Dynamic Partitioning
[ https://issues.apache.org/jira/browse/HIVE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250277#comment-15250277 ]

Mariappan Asokan commented on HIVE-6476:
---

Sushanth, thank you. This is very helpful. I will dig into the code, and if I have any questions I will let you know. Is FileOutputCommitterContainer.java a good place to start? Can I assign this Jira to myself?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-6476) Support Append with Dynamic Partitioning
[ https://issues.apache.org/jira/browse/HIVE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243005#comment-15243005 ]

Mariappan Asokan commented on HIVE-6476:
---

I have the same question: What are the corner cases that need to be tested? Dynamic partitioning with append is a very common use case. Sushanth, if you can elaborate on the "corner cases" and give some pointers, I can pick up this Jira and work on it. Thanks.