[ 
https://issues.apache.org/jira/browse/SPARK-18707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun closed SPARK-18707.
---------------------------------
    Resolution: Invalid

Sorry, but JIRA is not for questions.
This looks like a usage question to me.
Please ask on `[email protected]`.

> Can Spark support exactly-once processing with Kafka, given the following questions?
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-18707
>                 URL: https://issues.apache.org/jira/browse/SPARK-18707
>             Project: Spark
>          Issue Type: Question
>            Reporter: hustfxj
>
> 1. When a task completes, it notifies the driver. If the driver does not 
> receive the message because of a network problem, it will think the task is 
> still running. Will the child stage then never be scheduled?
> 2. How does Spark guarantee that a downstream task receives the shuffle data 
> completely? In fact, I can't find a checksum for blocks in Spark. For 
> example, the upstream task may shuffle 100 MB of data, but the downstream 
> task may receive only 99 MB because of a network problem. Can Spark verify, 
> based on size, that the data was received completely?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

