sivabalan narayanan created HUDI-6561:
-----------------------------------------
Summary: Ensure there is no data duplication with spark streaming writes
Key: HUDI-6561
URL: https://issues.apache.org/jira/browse/HUDI-6561
Project: Apache Hudi
Issue Type: Improvement
Components: spark
Reporter: sivabalan narayanan
With Spark streaming writes, we can use the batchId to tell a first batch
apart from an existing batch that was resumed after a long time.
We should use the batch ID to guarantee that writes are idempotent, so that
a resumed or replayed batch does not write its data a second time.
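
A minimal sketch of the idea, not Hudi's actual implementation: the sink
remembers the last batch ID it committed and skips any batch whose ID is not
newer. The class IdempotentBatchSink, its fields, and write_batch are
hypothetical names used only for illustration. The (rows, batch_id) argument
order mirrors the callback that Spark's foreachBatch passes to user code.
A real sink would have to persist the batch ID atomically with the data
commit, which this in-memory example only assumes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IdempotentBatchSink:
    """Toy in-memory sink that skips micro-batches it has already committed."""

    # Hypothetical stand-in for a batch ID stored alongside the table's
    # commit metadata; None means nothing has been committed yet.
    last_committed_batch_id: Optional[int] = None
    rows: List[Dict] = field(default_factory=list)

    def write_batch(self, batch_rows: List[Dict], batch_id: int) -> bool:
        """Write the batch unless its ID was already committed.

        Returns True if the rows were written, False if the batch was
        recognised as a replay and skipped.
        """
        already_committed = (
            self.last_committed_batch_id is not None
            and batch_id <= self.last_committed_batch_id
        )
        if already_committed:
            return False

        # Assumption: in a real system these two steps form one atomic commit.
        self.rows.extend(batch_rows)
        self.last_committed_batch_id = batch_id
        return True


if __name__ == "__main__":
    sink = IdempotentBatchSink()
    sink.write_batch([{"id": 1}], batch_id=0)  # first batch, written
    sink.write_batch([{"id": 1}], batch_id=0)  # replay after restart, skipped
    sink.write_batch([{"id": 2}], batch_id=1)  # next batch, written
    print(len(sink.rows), sink.last_committed_batch_id)
```

With this check in place, a query that restarts and re-delivers batch 0 leaves
the sink unchanged, while a fresh sink (no committed ID) accepts batch 0 as
its first batch.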