allow outputcommitters to skip setup/cleanup
--------------------------------------------
Key: MAPREDUCE-1802
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1802
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Joydeep Sen Sarma
Assignee: Joydeep Sen Sarma
Job setup and cleanup overheads in our (larger) clusters are very significant
and add to latency for small jobs. It turns out that Hive does not require job
setup and cleanup at all, since all management of output/temporary files and
such is done on the Hive client side. So it would be a big win for our
environment (and Hive users in general) if we could skip job setup/cleanup
altogether.
The proposal is to add new calls to the OutputCommitter interface (along the
lines of needsTaskCommit()) that optionally allow setup/cleanup to be skipped,
and for the JobTracker (JT) to take these into account when scheduling
setup/cleanup. For example, NullOutputFormat should not need setup/cleanup.
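
A rough sketch of what the proposal might look like, in plain Java with no
Hadoop dependency. Everything here is hypothetical and for illustration only:
SketchCommitter stands in for OutputCommitter, needsJobSetup() and
needsJobCleanup() are assumed names for the new calls, and plannedPhases()
stands in for the scheduling decision the JT would make. None of these names
come from the Hadoop code base.

```java
import java.util.ArrayList;
import java.util.List;

class SetupCleanupSketch {

    // Hypothetical stand-in for OutputCommitter; not the real Hadoop class.
    abstract static class SketchCommitter {
        // Assumed new calls. Defaulting to true keeps today's behaviour
        // for committers that do not override them.
        boolean needsJobSetup() { return true; }
        boolean needsJobCleanup() { return true; }
    }

    // A committer that manages output files and so keeps both phases.
    static class FileLikeCommitter extends SketchCommitter { }

    // A committer for output formats that write nothing
    // (the NullOutputFormat case): both phases can be skipped.
    static class NullCommitter extends SketchCommitter {
        @Override boolean needsJobSetup() { return false; }
        @Override boolean needsJobCleanup() { return false; }
    }

    // Stand-in for the scheduler: only plan the phases the committer asks for.
    static List<String> plannedPhases(SketchCommitter committer) {
        List<String> phases = new ArrayList<>();
        if (committer.needsJobSetup()) {
            phases.add("JOB_SETUP");
        }
        phases.add("TASKS");
        if (committer.needsJobCleanup()) {
            phases.add("JOB_CLEANUP");
        }
        return phases;
    }

    public static void main(String[] args) {
        System.out.println(plannedPhases(new FileLikeCommitter())); // [JOB_SETUP, TASKS, JOB_CLEANUP]
        System.out.println(plannedPhases(new NullCommitter()));     // [TASKS]
    }
}
```

Defaulting the new calls to true means existing committers would be unaffected
unless they opt out, which mirrors how needsTaskCommit() lets a committer
decline the task commit step.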