[
https://issues.apache.org/jira/browse/SPARK-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Reynold Xin updated SPARK-18024:
--------------------------------
Description:
This commit protocol API should wrap around Hadoop's output committer. Later we
can expand the API to cover streaming commits.
The existing Hadoop output committer API is insufficient for streaming use
cases:
1. It has no way for tasks to pass information back to the driver.
2. It relies on Hadoop's string-keyed configuration hashmap to pass information
from the driver to the executors, largely because Hadoop MapReduce has no
support for language integration and serialization. Spark has more natural
support for passing information through automatic closure serialization.
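The two points above can be illustrated with a minimal sketch. It is not the API proposed in this issue: the names SketchCommitProtocol, TaskCommitMessage, run_task and run_job are hypothetical, the "executors" are plain function calls in one process, and copy.deepcopy stands in for Spark's closure serialization. It only shows the shape of the idea: the driver ships a protocol object to each task, each task returns a commit message, and the driver uses those messages to commit the job.

```python
import copy
import os
import shutil


class TaskCommitMessage:
    """Hypothetical payload a task sends back to the driver on commit."""

    def __init__(self, task_id, files):
        self.task_id = task_id
        self.files = list(files)


class SketchCommitProtocol:
    """Illustrative commit protocol; not Spark's or Hadoop's actual API."""

    def __init__(self, output_dir):
        self.output_dir = output_dir
        self.staging_dir = os.path.join(output_dir, "_staging")

    def setup_job(self):
        # Driver side: create a staging area for uncommitted task output.
        os.makedirs(self.staging_dir, exist_ok=True)

    def write_task_output(self, task_id, data):
        # Task side: write to staging so partial output is never visible.
        path = os.path.join(self.staging_dir, "part-%05d" % task_id)
        with open(path, "w") as f:
            f.write(data)
        return path

    def commit_task(self, task_id, paths):
        # Task side: report what was written back to the driver (point 1).
        return TaskCommitMessage(task_id, paths)

    def commit_job(self, messages):
        # Driver side: publish exactly the files the tasks reported.
        committed = []
        for msg in messages:
            for src in msg.files:
                dst = os.path.join(self.output_dir, os.path.basename(src))
                os.replace(src, dst)
                committed.append(dst)
        shutil.rmtree(self.staging_dir)
        return sorted(committed)

    def abort_job(self):
        # Driver side: discard everything that was staged.
        shutil.rmtree(self.staging_dir, ignore_errors=True)


def run_task(shipped_protocol, task_id, data):
    # The task receives the protocol object itself rather than reading
    # settings out of a configuration map (point 2).
    path = shipped_protocol.write_task_output(task_id, data)
    return shipped_protocol.commit_task(task_id, [path])


def run_job(output_dir, partitions):
    protocol = SketchCommitProtocol(output_dir)
    protocol.setup_job()
    try:
        # deepcopy is a stand-in for serializing the object to an executor.
        messages = [
            run_task(copy.deepcopy(protocol), task_id, data)
            for task_id, data in enumerate(partitions)
        ]
        return protocol.commit_job(messages)
    except Exception:
        protocol.abort_job()
        raise
```

Calling run_job(some_dir, ["a", "b"]) stages one file per partition and then moves both into some_dir on job commit. A streaming variant could carry extra fields in the commit message, which is the kind of extension the description anticipates.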
was:
This commit protocol API should wrap around Hadoop's output committer. Later we
can expand the API to cover streaming commits.
> Introduce a commit protocol API along with OutputCommitter implementation
> -------------------------------------------------------------------------
>
> Key: SPARK-18024
> URL: https://issues.apache.org/jira/browse/SPARK-18024
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Reynold Xin
> Assignee: Reynold Xin
> Fix For: 2.1.0