[
https://issues.apache.org/jira/browse/HCATALOG-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242516#comment-13242516
]
[email protected] commented on HCATALOG-36:
-------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4437/
-----------------------------------------------------------
(Updated 2012-03-30 16:36:00.761727)
Review request for hcatalog, Owen O'Malley, Ranjit Mathew, and Alejandro
Abdelnur.
Changes
-------
Changed the patch to use a generic MultiOutputFormat, which can later be ported
to mapreduce, instead of an hcat-specific MultiTableOutputFormat.
Rationale for MultiOutputFormat:
Mapred already has MultipleOutputFormat and MultipleOutputs, and we are
creating yet another one. MultipleOutputFormat was rejected because it does not
fit the use case well and it extends FileOutputFormat. MultipleOutputs was more
generic and closer to what we need, with cleaner APIs, but it lacked the
following:
1) Does not extend OutputFormat and therefore lacks support for
checkOutputSpecs() and OutputCommitters.
2) Does not handle multiple mapred.output.dir values and is still based on
FileOutputFormat.
3) Does not handle cases where an OutputFormat adds its own configuration to
the JobConf and checkOutputSpecs(), getRecordWriter() and the OutputCommitter
must then be invoked with that configuration. This is essential for
HCatOutputFormat.
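To illustrate points 1) to 3), here is a minimal sketch of the delegation idea
in plain Java. It is not the patch's MultiOutputFormat and does not use the
Hadoop API: SimpleOutputFormat, MultiOutputSketch, RecordingFormat and the
addOutput()/confFor() methods are hypothetical names, and a Map<String, String>
stands in for the JobConf. It only shows how a composite can keep one
configuration per named output and invoke each child's checkOutputSpecs() and
commit step with that child's own settings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an output format; NOT the Hadoop OutputFormat API.
interface SimpleOutputFormat {
    void checkOutputSpecs(Map<String, String> conf);
    void commit(Map<String, String> conf);
}

// Composite that keeps one delegate and one private configuration per alias.
class MultiOutputSketch implements SimpleOutputFormat {
    private final Map<String, SimpleOutputFormat> delegates = new LinkedHashMap<>();
    private final Map<String, Map<String, String>> overrides = new LinkedHashMap<>();

    void addOutput(String alias, SimpleOutputFormat format, Map<String, String> conf) {
        delegates.put(alias, format);
        overrides.put(alias, new LinkedHashMap<>(conf)); // defensive copy
    }

    // Base job configuration overlaid with the alias-specific settings.
    Map<String, String> confFor(String alias, Map<String, String> base) {
        Map<String, String> own = overrides.get(alias);
        if (own == null) {
            throw new IllegalArgumentException("Unknown output alias: " + alias);
        }
        Map<String, String> merged = new LinkedHashMap<>(base);
        merged.putAll(own);
        return merged;
    }

    // Each child validates against its own view of the configuration.
    @Override
    public void checkOutputSpecs(Map<String, String> base) {
        for (Map.Entry<String, SimpleOutputFormat> e : delegates.entrySet()) {
            e.getValue().checkOutputSpecs(confFor(e.getKey(), base));
        }
    }

    // Each child commits with the same per-alias configuration.
    @Override
    public void commit(Map<String, String> base) {
        for (Map.Entry<String, SimpleOutputFormat> e : delegates.entrySet()) {
            e.getValue().commit(confFor(e.getKey(), base));
        }
    }
}

// Toy child format that records which output directory it was handed.
class RecordingFormat implements SimpleOutputFormat {
    final List<String> seen = new ArrayList<>();

    @Override
    public void checkOutputSpecs(Map<String, String> conf) {
        String dir = conf.get("mapred.output.dir");
        if (dir == null) {
            throw new IllegalStateException("mapred.output.dir is not set");
        }
        seen.add("check:" + dir);
    }

    @Override
    public void commit(Map<String, String> conf) {
        seen.add("commit:" + conf.get("mapred.output.dir"));
    }
}
```

Registering two RecordingFormat instances under different aliases, each with
its own mapred.output.dir, lets a single checkOutputSpecs() or commit() call on
the composite reach both children without either seeing the other's directory.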
Summary (updated)
-------
Patch description
1) Created a generic MultiOutputFormat, which can later be ported to mapreduce,
instead of an hcat-specific MultiTableOutputFormat.
Classes - MultiOutputFormat.java, TestMultiOutputFormat.java. HCat related
tests for MultiOutputFormat in TestHCatMultiOutputFormat.java
2) Added closeHiveClientQuietly() as unit tests were failing because of
HIVE-2883/HCATALOG-236
3) FileOutputCommitterContainer now sets the correct file permissions on the
output files as well as on the partition directory. This was required with
multiple tables.
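As an illustration of the "close quietly" pattern named in item 2), here is a
minimal generic sketch. It is not the patch's closeHiveClientQuietly(), whose
signature and behaviour are not shown in this message; QuietCloser and
closeQuietly() are hypothetical names, and the resource is assumed to be an
AutoCloseable.

```java
// Hypothetical helper; the patch's closeHiveClientQuietly() may differ.
final class QuietCloser {
    private QuietCloser() {
    }

    // Returns true if close() completed, false if the resource was null
    // or close() threw. No exception escapes to the caller.
    static boolean closeQuietly(AutoCloseable resource) {
        if (resource == null) {
            return false;
        }
        try {
            resource.close();
            return true;
        } catch (Exception e) {
            // Swallowed on purpose: a failure during cleanup should not
            // mask the result of the work that came before it.
            return false;
        }
    }
}
```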
This addresses bug HCATALOG-36.
https://issues.apache.org/jira/browse/HCATALOG-36
Diffs (updated)
-----
http://svn.apache.org/repos/asf/incubator/hcatalog/trunk/src/java/org/apache/hcatalog/common/HCatUtil.java
1307252
http://svn.apache.org/repos/asf/incubator/hcatalog/trunk/src/java/org/apache/hcatalog/mapreduce/DefaultOutputCommitterContainer.java
1307252
http://svn.apache.org/repos/asf/incubator/hcatalog/trunk/src/java/org/apache/hcatalog/mapreduce/FileOutputCommitterContainer.java
1307252
http://svn.apache.org/repos/asf/incubator/hcatalog/trunk/src/java/org/apache/hcatalog/mapreduce/FileOutputFormatContainer.java
1307252
http://svn.apache.org/repos/asf/incubator/hcatalog/trunk/src/java/org/apache/hcatalog/mapreduce/FosterStorageHandler.java
1307252
http://svn.apache.org/repos/asf/incubator/hcatalog/trunk/src/java/org/apache/hcatalog/mapreduce/MultiOutputFormat.java
PRE-CREATION
http://svn.apache.org/repos/asf/incubator/hcatalog/trunk/src/test/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
PRE-CREATION
http://svn.apache.org/repos/asf/incubator/hcatalog/trunk/src/test/org/apache/hcatalog/mapreduce/TestMultiOutputFormat.java
PRE-CREATION
Diff: https://reviews.apache.org/r/4437/diff
Testing (updated)
-------
Unit and integration tested.
Thanks,
Rohini
> Support Writing Out to Multiple Tables in HCatOutputFormat
> ----------------------------------------------------------
>
> Key: HCATALOG-36
> URL: https://issues.apache.org/jira/browse/HCATALOG-36
> Project: HCatalog
> Issue Type: Improvement
> Affects Versions: 0.2
> Reporter: Ranjit Mathew
> Assignee: Rohini Palaniswamy
> Attachments: multihcat.tgz
>
>
> HCatOutputFormat does not support writing out to multiple tables (or
> partitions for that matter).
> Add this support to HCatalog.