[
https://issues.apache.org/jira/browse/SPARK-18707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun closed SPARK-18707.
---------------------------------
Resolution: Invalid
Sorry, but JIRA is not for questions.
This looks like a usage question to me.
Please ask on `[email protected]`.
> Can spark support exactly once based kafka ? Due to these following question?
> ------------------------------------------------------------------------------
>
> Key: SPARK-18707
> URL: https://issues.apache.org/jira/browse/SPARK-18707
> Project: Spark
> Issue Type: Question
> Reporter: hustfxj
>
> 1. When a task completes its operation, it notifies the driver. The driver
> may not receive the message because of a network problem, and so may think
> the task is still running. Will the child stage then never be scheduled?
> 2. How does Spark guarantee that the downstream task receives the shuffle
> data completely? In fact, I can't find a checksum for blocks in Spark. For
> example, the upstream task may shuffle 100 MB of data, but the downstream
> task may receive only 99 MB because of the network. Can Spark verify that
> the data was received completely, based on its size?
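As a generic illustration of the check that question 2 asks about, the sketch below frames a block with its length and a CRC32 checksum so that a receiver can detect truncation or corruption. It is not a description of how Spark handles shuffle blocks; the function names `frame_block` and `verify_block` and the 12-byte header layout are assumptions made up for this example.

```python
import zlib


def frame_block(payload: bytes) -> bytes:
    """Prefix a payload with its length (8 bytes) and CRC32 (4 bytes).

    Hypothetical framing for illustration only, not Spark's wire format.
    """
    header = len(payload).to_bytes(8, "big") + zlib.crc32(payload).to_bytes(4, "big")
    return header + payload


def verify_block(framed: bytes) -> bytes:
    """Return the payload, or raise ValueError if it is truncated or corrupted."""
    if len(framed) < 12:
        raise ValueError("header incomplete")
    expected_len = int.from_bytes(framed[:8], "big")
    expected_crc = int.from_bytes(framed[8:12], "big")
    payload = framed[12:]
    # Size check: catches the "sent 100 MB, received 99 MB" case.
    if len(payload) != expected_len:
        raise ValueError(f"expected {expected_len} bytes, got {len(payload)}")
    # Checksum check: catches corruption that leaves the size unchanged.
    if zlib.crc32(payload) != expected_crc:
        raise ValueError("checksum mismatch")
    return payload
```

A size comparison alone would miss corruption that preserves the byte count, which is why the sketch checks both the length and the checksum.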