[ 
https://issues.apache.org/jira/browse/SPARK-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-18024:
--------------------------------
    Description: 
This commit protocol API should wrap around Hadoop's output committer. Later we 
can expand the API to cover streaming commits.

The existing Hadoop output committer API is insufficient for streaming use 
cases:

1. It has no way for tasks to pass information back to the driver.

2. It relies on the Hadoop configuration map (a string key-value store) to 
pass information from the driver to the executors, largely because Hadoop 
MapReduce has no support for language integration and serialization. Spark has 
more natural support for passing information through automatic closure 
serialization.
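
To make the two points above concrete, here is a minimal sketch of what such a 
commit protocol could look like. It is only an illustration under stated 
assumptions: the names SketchCommitProtocol, TaskCommitMessage, run_task and 
run_job are hypothetical, it is not the API proposed in this ticket, and it 
uses the local filesystem in place of Hadoop's output committer. Tasks write 
into a staging directory and return a commit message, and the driver collects 
those messages before committing the job.

```python
import os
import shutil
from dataclasses import dataclass, field


@dataclass
class TaskCommitMessage:
    """Hypothetical payload a task sends back to the driver (point 1)."""
    task_id: int
    files: list = field(default_factory=list)


class SketchCommitProtocol:
    """Illustrative commit protocol; names and behavior are assumptions."""

    def __init__(self, output_dir):
        self.output_dir = output_dir
        self.staging_dir = os.path.join(output_dir, "_temporary")

    def setup_job(self):
        # Driver side: create the staging area before any task runs.
        os.makedirs(self.staging_dir, exist_ok=True)

    def new_task_temp_file(self, task_id, name):
        # Executor side: each task writes under its own staging directory.
        task_dir = os.path.join(self.staging_dir, f"task_{task_id}")
        os.makedirs(task_dir, exist_ok=True)
        return os.path.join(task_dir, name)

    def commit_task(self, task_id, written):
        # Executor side: report what was written instead of publishing it.
        return TaskCommitMessage(task_id, list(written))

    def commit_job(self, messages):
        # Driver side: publish every reported file, then drop the staging area.
        for msg in messages:
            for path in msg.files:
                target = os.path.join(self.output_dir, os.path.basename(path))
                os.replace(path, target)
        shutil.rmtree(self.staging_dir)


def run_task(protocol, task_id, rows):
    # The protocol object is passed in directly here. In Spark it would travel
    # with the serialized task closure rather than through a config map (point 2).
    path = protocol.new_task_temp_file(task_id, f"part-{task_id:05d}.txt")
    with open(path, "w") as f:
        f.write("\n".join(rows))
    return protocol.commit_task(task_id, [path])


def run_job(output_dir, partitions):
    protocol = SketchCommitProtocol(output_dir)
    protocol.setup_job()
    messages = [run_task(protocol, i, rows) for i, rows in enumerate(partitions)]
    protocol.commit_job(messages)
    return sorted(os.listdir(output_dir))
```

Calling run_job(some_empty_dir, [["a", "b"], ["c"]]) writes one part file per 
partition. Nothing becomes visible in the output directory until the driver 
has every task's commit message, which is the kind of driver-side decision a 
streaming commit would need.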


  was:
This commit protocol API should wrap around Hadoop's output committer. Later we 
can expand the API to cover streaming commits.



> Introduce a commit protocol API along with OutputCommitter implementation
> -------------------------------------------------------------------------
>
>                 Key: SPARK-18024
>                 URL: https://issues.apache.org/jira/browse/SPARK-18024
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>             Fix For: 2.1.0
>
>
> This commit protocol API should wrap around Hadoop's output committer. Later 
> we can expand the API to cover streaming commits.
> The existing Hadoop output committer API is insufficient for streaming use 
> cases:
> 1. It has no way for tasks to pass information back to the driver.
> 2. It relies on the Hadoop configuration map (a string key-value store) to 
> pass information from the driver to the executors, largely because Hadoop 
> MapReduce has no support for language integration and serialization. Spark 
> has more natural support for passing information through automatic closure 
> serialization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
