Github user jose-torres commented on the issue:
https://github.com/apache/spark/pull/20710
I'm not certain I understand the question.
From the perspective of query plan execution, the non-continuous (micro-batch) streaming mode does indeed just use the batch interface. The motivation for adding the epoch ID to DataWriterFactory is to let that mode keep using the batch interface, rather than forcing it onto a separate StreamingDataWriterFactory.
From the perspective of the writer, however, the batch interface is not sufficient. The epoch ID is relevant to the data writer for the same reason the partition ID is: both are needed to tell the remote sink which segment of the overall data the writer is responsible for.
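A minimal sketch of that idea, using a hypothetical writer class (the names `SegmentWriter`, `write`, and `commit` are illustrative, not Spark's exact API): the pair (epoch ID, partition ID) uniquely identifies the output segment a task owns, which lets a sink treat a replayed epoch as an overwrite rather than a duplicate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer: partitionId and epochId together identify
// which segment of the overall output this task is responsible for.
class SegmentWriter {
    private final int partitionId;
    private final long epochId;
    private final List<String> buffer = new ArrayList<>();

    SegmentWriter(int partitionId, long epochId) {
        this.partitionId = partitionId;
        this.epochId = epochId;
    }

    void write(String record) {
        buffer.add(record);
    }

    // On commit, the sink can use (epochId, partitionId) as an
    // idempotence key: replaying the same epoch rewrites the same
    // segment instead of appending duplicate data.
    String commit() {
        return "epoch=" + epochId + "/part=" + partitionId
                + "/rows=" + buffer.size();
    }
}

public class Demo {
    public static void main(String[] args) {
        SegmentWriter writer = new SegmentWriter(3, 42L);
        writer.write("a");
        writer.write("b");
        System.out.println(writer.commit());
    }
}
```

Without the epoch ID, two commits from the same partition in different epochs would be indistinguishable to the sink, which is exactly the gap the PR addresses.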