[
https://issues.apache.org/jira/browse/SPARK-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tathagata Das updated SPARK-4964:
---------------------------------
Description:
There are two issues with the current Kafka support:
- Use of Write Ahead Logs in Spark Streaming to ensure no data is lost.
This causes the data to be replicated in both Kafka AND Spark Streaming.
- Lack of exactly-once semantics. For background, see
http://apache-spark-developers-list.1001551.n3.nabble.com/Which-committers-care-about-Kafka-td9827.html
We want to solve both of these problems in this JIRA. Please see the following
design doc for the solution.
https://docs.google.com/a/databricks.com/document/d/1IuvZhg9cOueTf1mq4qwc1fhPb5FVcaRLcyjrtG4XU1k/edit#heading=h.itproy77j3p
was:
for background, see
http://apache-spark-developers-list.1001551.n3.nabble.com/Which-committers-care-about-Kafka-td9827.html
There ar
> Exactly-once + WAL-free Kafka Support in Spark Streaming
> --------------------------------------------------------
>
> Key: SPARK-4964
> URL: https://issues.apache.org/jira/browse/SPARK-4964
> Project: Spark
> Issue Type: Improvement
> Components: Streaming
> Reporter: Cody Koeninger