sivabalan narayanan created HUDI-5077:
-----------------------------------------
Summary: Supporting multiple deltastreamers writing to a single
hudi table
Key: HUDI-5077
URL: https://issues.apache.org/jira/browse/HUDI-5077
Project: Apache Hudi
Issue Type: Improvement
Components: deltastreamer
Reporter: sivabalan narayanan
As of now, only a single deltastreamer can write to a given Hudi table. We
have an ask from the community to allow 2 deltastreamers to write to a single
table.
Things required to be fixed:
# We need to change the checkpointing to hold multiple key-value pairs, where
the key is a unique identifier for the deltastreamer client and the value is
that client's checkpoint. We might need to introduce a new notion of
identifier for each deltastreamer in this case.
# Within delta sync, after writeClient.upsert and before calling
writeClient.commit, we need to update the checkpoint value. For this, we might
need to take a lock, fetch the latest checkpoint from the timeline (since there
could be multiple writers), update the checkpoint, and then release the
lock.
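The two items above could be sketched roughly as follows. This is only an
illustration in Python, not Hudi code: Timeline, table_lock,
commit_with_checkpoint, the "checkpoints" metadata key, and the writer ids are
all hypothetical stand-ins, and the real change would live in delta sync
between writeClient.upsert and writeClient.commit.

```python
import json
import threading


class Timeline:
    """Hypothetical in-memory stand-in for the table timeline.

    Each commit carries a metadata dict; the "checkpoints" entry holds a
    JSON-encoded map of writer id -> checkpoint (an assumed layout).
    """

    def __init__(self):
        self.commits = []  # commit metadata dicts, oldest first

    def latest_checkpoints(self):
        # Read the checkpoint map from the most recent commit, if any.
        if not self.commits:
            return {}
        return json.loads(self.commits[-1]["checkpoints"])

    def commit(self, metadata):
        self.commits.append(metadata)


# Stand-in for a table-level lock shared by all writers (assumption).
table_lock = threading.Lock()


def commit_with_checkpoint(timeline, writer_id, new_checkpoint):
    """Update only this writer's checkpoint, keeping the other writers' values."""
    with table_lock:  # take the lock; released on exit
        # Fetch the latest map, since another writer may have committed.
        checkpoints = timeline.latest_checkpoints()
        checkpoints[writer_id] = new_checkpoint
        timeline.commit({"checkpoints": json.dumps(checkpoints, sort_keys=True)})
        return checkpoints


if __name__ == "__main__":
    timeline = Timeline()
    commit_with_checkpoint(timeline, "ds-a", "topicA,0:100")
    commit_with_checkpoint(timeline, "ds-b", "topicB,0:7")
    commit_with_checkpoint(timeline, "ds-a", "topicA,0:250")
    print(timeline.latest_checkpoints())
```

Re-reading the map while holding the lock is what keeps one writer from
overwriting the other's checkpoint with a stale copy.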
These are the changes I can think of. While implementing them, some more
minor fixes may turn out to be required.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)