[
https://issues.apache.org/jira/browse/HIVE-16295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448922#comment-16448922
]
Sahil Takiar commented on HIVE-16295:
-------------------------------------
Attaching initial patch to get a run of Hive QA in. Here is an update of my
progress so far:
* Attached a WIP prototype that works for several basic use cases: CTAS,
INSERT INTO, etc.
* Most of the {{hive-blobstore}} tests are passing when run against S3A; there
are a bunch of explain diffs because a new stage is introduced, but the query
outputs are the same
* Dynamic partitioning doesn't work yet
* Haven't really investigated bucketed tables yet, but the tests are passing
locally
* There are a number of hacks that need to be cleaned up, but I think the
overall design is mostly in place
At a high level the design is as follows:
* Introduce a new task that is run before a {{SparkTask}} or {{MapReduceTask}};
this {{Task}} will create a specified {{PathOutputCommitter}} and run its
{{setupJob}} method
* The {{MoveTask}} will do the same thing, but run {{commitJob}}
* Bunch of other changes to {{MoveTask}} so that it doesn't run any of the
{{fs}} operations to commit any data
* The {{S3ACommitOptimization}} is a new physical optimization that does some
setup work so that the {{FileSinkOperator}} has access to the final output
path; it also handles a number of other setup tasks to make sure the
{{FileSinkOperator}} uses the specified {{PathOutputCommitter}} and sets the
working and output paths correctly
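As a rough illustration of the ordering described above, here is a minimal Java sketch. It does not use Hadoop or Hive classes: {{JobCommitter}}, {{Task}}, {{RecordingCommitter}} and {{runPlan}} are hypothetical stand-ins invented for this example, and the S3A path is a placeholder. It only shows that the setup task calls {{setupJob}} before the execution task writes data, and that the move step calls {{commitJob}} instead of renaming files.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitFlowSketch {

    /** Hypothetical stand-in for the job-level half of a PathOutputCommitter. */
    interface JobCommitter {
        void setupJob(String outputPath);
        void commitJob(String outputPath);
    }

    /** Hypothetical stand-in for a Hive task in the stage plan. */
    interface Task {
        void execute();
    }

    /** Records each call instead of touching a real file system. */
    static final class RecordingCommitter implements JobCommitter {
        private final List<String> log;

        RecordingCommitter(List<String> log) {
            this.log = log;
        }

        @Override
        public void setupJob(String outputPath) {
            log.add("setupJob " + outputPath);
        }

        @Override
        public void commitJob(String outputPath) {
            log.add("commitJob " + outputPath);
        }
    }

    /** Builds and runs the three-stage plan, returning the recorded call order. */
    static List<String> runPlan(String outputPath) {
        List<String> log = new ArrayList<>();
        JobCommitter committer = new RecordingCommitter(log);

        List<Task> plan = new ArrayList<>();
        // New task that runs before the SparkTask / MapReduceTask.
        plan.add(() -> committer.setupJob(outputPath));
        // Stand-in for the SparkTask / MapReduceTask that writes the data.
        plan.add(() -> log.add("write " + outputPath));
        // Stand-in for the MoveTask: commits the job, performs no fs renames.
        plan.add(() -> committer.commitJob(outputPath));

        for (Task task : plan) {
            task.execute();
        }
        return log;
    }

    public static void main(String[] args) {
        runPlan("s3a://example-bucket/warehouse/t1").forEach(System.out::println);
    }
}
```

In the actual patch the committer would be a real {{PathOutputCommitter}} and the tasks would be wired into the plan by the optimizer; the sketch only captures the call order.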
The main caveat is that I don't think this will work when the Hive
Merge-Small-Files job is triggered. The reason is that this job implicitly
depends on the fact that renames are atomic operations, which is not the case
on S3. Right now, I've disabled the job by default, but need to come up with a
cleaner solution. Probably will need to short-circuit the optimizations if the
Merge-Small-Files job is enabled. The only place it is turned on by default is
when files are written by a Map-only MR job, but we should be able to detect
that scenario and automatically disable the committer optimizations.
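One possible shape for that short-circuit check is sketched below in Java. This is an assumption, not what the patch does: the configuration is modelled as a plain map, {{useCommitterOptimization}} is a hypothetical helper, and the property names {{hive.merge.mapfiles}} and {{hive.merge.mapredfiles}} (with assumed defaults of true and false) should be verified against the Hive version in use.

```java
import java.util.HashMap;
import java.util.Map;

public class CommitterGuardSketch {

    /**
     * Returns true when the Merge-Small-Files job would be triggered.
     * Property names and defaults are assumptions for this sketch.
     */
    static boolean mergeEnabled(Map<String, String> conf, boolean mapOnlyJob) {
        boolean mergeMapFiles =
                Boolean.parseBoolean(conf.getOrDefault("hive.merge.mapfiles", "true"));
        boolean mergeMapRedFiles =
                Boolean.parseBoolean(conf.getOrDefault("hive.merge.mapredfiles", "false"));
        return mapOnlyJob ? mergeMapFiles : mergeMapRedFiles;
    }

    /** Hypothetical guard: skip the committer optimization whenever merging is on. */
    static boolean useCommitterOptimization(Map<String, String> conf, boolean mapOnlyJob) {
        return !mergeEnabled(conf, mapOnlyJob);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("map-only job:   " + useCommitterOptimization(conf, true));
        System.out.println("map-reduce job: " + useCommitterOptimization(conf, false));
    }
}
```

With the assumed defaults, a Map-only job falls back to the existing commit path, while a job with a reduce phase keeps the committer optimization.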
> Add support for using Hadoop's S3A OutputCommitter
> --------------------------------------------------
>
> Key: HIVE-16295
> URL: https://issues.apache.org/jira/browse/HIVE-16295
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
> Attachments: HIVE-16295.1.WIP.patch
>
>
> Hive doesn't have integration with Hadoop's {{OutputCommitter}}; it uses a
> {{NullOutputCommitter}} and uses its own commit logic spread across
> {{FileSinkOperator}}, {{MoveTask}}, and {{Hive}}.
> The Hadoop community is building an {{OutputCommitter}} that integrates with
> S3Guard and does a safe, coordinated commit of data on S3 inside individual
> tasks (HADOOP-13786). If Hive can integrate with this new {{OutputCommitter}}
> there would be a lot of benefits to Hive-on-S3:
> * Data is only written once; directly committing data at a task level means
> no renames are necessary
> * The commit is done safely, in a coordinated manner; duplicate tasks (from
> task retries or speculative execution) should not step on each other