Mithun Radhakrishnan created HIVE-17181:
-------------------------------------------

             Summary: HCatOutputFormat should expose complete output-schema 
(including partition-keys) for dynamic-partitioning MR jobs
                 Key: HIVE-17181
                 URL: https://issues.apache.org/jira/browse/HIVE-17181
             Project: Hive
          Issue Type: Bug
          Components: HCatalog
            Reporter: Mithun Radhakrishnan
            Assignee: Mithun Radhakrishnan


Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
partitioning are expected to call the following API methods:
# {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to write 
to. This call populates the {{OutputJobInfo}} with details fetched from the 
Metastore.
# {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
being written.

It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
{code:java}
HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
{code}

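Unfortunately, {{getTableSchema()}} returns only the record-schema (the data 
columns), not the entire table's schema: the partition-keys are left out, even 
though a dynamic-partitioning job needs them in its output-schema. We'll need a 
better API for use in M/R jobs to get the complete table-schema.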
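
To make the gap concrete, here is a minimal sketch in plain Java. It does not use the HCatalog API: columns are modelled as bare strings, and {{completeSchema()}}, the class name, and the column names are all hypothetical, chosen only for illustration. It assumes the complete schema is the record-schema followed by the partition-keys, which is how Hive lists the columns of a partitioned table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CompleteSchemaSketch {

    // Hypothetical helper (not an HCatalog method): builds the complete
    // output-schema by appending the partition-keys to the record-schema.
    static List<String> completeSchema(List<String> recordSchema,
                                       List<String> partitionKeys) {
        List<String> all = new ArrayList<>(recordSchema);
        for (String key : partitionKeys) {
            // A partition-key that also appears as a data column would be ambiguous.
            if (all.contains(key)) {
                throw new IllegalArgumentException("Duplicate column: " + key);
            }
            all.add(key);
        }
        return all;
    }

    public static void main(String[] args) {
        // Stand-in for what a record-schema-only call would return: data columns only.
        List<String> recordSchema = Arrays.asList("user_id", "event");
        // Stand-in for the table's partition-keys, which the record-schema omits.
        List<String> partitionKeys = Arrays.asList("dt", "region");

        System.out.println("record-schema:   " + recordSchema);
        System.out.println("complete schema: " + completeSchema(recordSchema, partitionKeys));
    }
}
```

In this model, passing only {{recordSchema}} as the output-schema corresponds to the mistaken call above: the {{dt}} and {{region}} columns that dynamic partitioning relies on never appear in the schema.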



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
