Github user jose-torres commented on the issue:
https://github.com/apache/spark/pull/20710
I'm not certain I understand the question.
From the perspective of query plan execution, the non-continuous (micro-batch) streaming mode does indeed just use the batch interface. The motivation for adding the epoch ID to DataWriterFactory is to let that mode keep using the batch interface, rather than forcing it onto a separate StreamingDataWriterFactory.
From the perspective of the writer, however, the batch interface is not sufficient. The epoch ID is relevant to the data writer for the same reason the partition ID is: both are needed to tell the remote sink which segment of the overall data the writer is responsible for.
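A minimal sketch of that idea, using a hypothetical writer class (the names `SegmentWriter`, `write`, and `commit` are illustrative, not Spark's exact API): the pair (epoch ID, partition ID) uniquely identifies the output segment a task owns, which lets a sink treat a replayed epoch as an overwrite rather than a duplicate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer: partitionId and epochId together identify
// which segment of the overall output this task is responsible for.
class SegmentWriter {
    private final int partitionId;
    private final long epochId;
    private final List<String> buffer = new ArrayList<>();

    SegmentWriter(int partitionId, long epochId) {
        this.partitionId = partitionId;
        this.epochId = epochId;
    }

    void write(String record) {
        buffer.add(record);
    }

    // On commit, the sink can use (epochId, partitionId) as an
    // idempotence key: replaying the same epoch rewrites the same
    // segment instead of appending duplicate data.
    String commit() {
        return "epoch=" + epochId + "/part=" + partitionId
                + "/rows=" + buffer.size();
    }
}

public class Demo {
    public static void main(String[] args) {
        SegmentWriter writer = new SegmentWriter(3, 42L);
        writer.write("a");
        writer.write("b");
        System.out.println(writer.commit());
    }
}
```

Without the epoch ID, two commits from the same partition in different epochs would be indistinguishable to the sink, which is exactly the gap the PR addresses.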