[
https://issues.apache.org/jira/browse/HUDI-7416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinish Reddy updated HUDI-7416:
-------------------------------
Description:
Introduce a new class, {{StreamProfile}}, that describes how the next sync
round in StreamSync should be consumed and written. For example:
{{KafkaStreamProfile}} contains the number of events to consume in this sync
round.
{{S3StreamProfile}} contains the list of files to consume in this sync round.
{{HudiIncrementalStreamProfile}} contains the beginInstant and endInstant
commit times to consume in this sync round.
In the future we can add methods for choosing the writeOperationType and
indexType as well; for now, {{streamProfile.getSourceSpecificContext()}} will
be used to consume the data from the source.
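A minimal sketch of how the profiles described above could look, to make the
proposal concrete. This is not the code from the pull request: the generic
type parameter, the constructors, the field names, the {{InstantRange}} holder
and the wrapping {{StreamProfileSketch}} class are all assumptions made for
illustration. Only the three profile names and
{{getSourceSpecificContext()}} come from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class StreamProfileSketch {

    /** Hypothetical shape: T is whatever the source needs for one sync round. */
    public interface StreamProfile<T> {
        T getSourceSpecificContext();
    }

    /** Assumed: the context is the number of events to consume this round. */
    public static final class KafkaStreamProfile implements StreamProfile<Long> {
        private final long numEvents;

        public KafkaStreamProfile(long numEvents) {
            this.numEvents = numEvents;
        }

        @Override
        public Long getSourceSpecificContext() {
            return numEvents;
        }
    }

    /** Assumed: the context is the list of file paths to consume this round. */
    public static final class S3StreamProfile implements StreamProfile<List<String>> {
        private final List<String> files;

        public S3StreamProfile(List<String> files) {
            // Defensive copy so the profile cannot change after creation.
            this.files = Collections.unmodifiableList(new ArrayList<>(files));
        }

        @Override
        public List<String> getSourceSpecificContext() {
            return files;
        }
    }

    /** Hypothetical holder for the begin and end commit times. */
    public static final class InstantRange {
        public final String beginInstant;
        public final String endInstant;

        public InstantRange(String beginInstant, String endInstant) {
            this.beginInstant = beginInstant;
            this.endInstant = endInstant;
        }
    }

    /** Assumed: the context is the commit-time range to consume this round. */
    public static final class HudiIncrementalStreamProfile
            implements StreamProfile<InstantRange> {
        private final InstantRange range;

        public HudiIncrementalStreamProfile(String beginInstant, String endInstant) {
            this.range = new InstantRange(beginInstant, endInstant);
        }

        @Override
        public InstantRange getSourceSpecificContext() {
            return range;
        }
    }

    public static void main(String[] args) {
        StreamProfile<Long> kafka = new KafkaStreamProfile(500L);
        StreamProfile<List<String>> s3 =
            new S3StreamProfile(Arrays.asList("a.parquet", "b.parquet"));
        StreamProfile<InstantRange> hudi =
            new HudiIncrementalStreamProfile("001", "002");

        System.out.println("kafka events: " + kafka.getSourceSpecificContext());
        System.out.println("s3 files: " + s3.getSourceSpecificContext());
        System.out.println("hudi range: " + hudi.getSourceSpecificContext().beginInstant
            + " to " + hudi.getSourceSpecificContext().endInstant);
    }
}
```

With a shape like this, each source would read only its own context type, and
later additions such as writeOperationType or indexType could be added as
further methods on the same interface.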
> Add interface for StreamProfile to be used in StreamSync for reading and
> writing data.
> ---------------------------------------------------------------------------------------
>
> Key: HUDI-7416
> URL: https://issues.apache.org/jira/browse/HUDI-7416
> Project: Apache Hudi
> Issue Type: Improvement
> Components: deltastreamer
> Reporter: Vinish Reddy
> Assignee: Vinish Reddy
> Priority: Major
> Labels: pull-request-available
--
This message was sent by Atlassian Jira
(v8.20.10#820010)