[ 
https://issues.apache.org/jira/browse/HUDI-8401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Madan updated HUDI-8401:
------------------------------
    Labels: story  (was: )

> Support CDC payload, partial updates, and custom logic through native record 
> merge API implementation
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-8401
>                 URL: https://issues.apache.org/jira/browse/HUDI-8401
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Y Ethan Guo
>            Assignee: Y Ethan Guo
>            Priority: Blocker
>              Labels: story
>             Fix For: 1.1.0, 1.0.2
>
>
> With the move towards making partial updates a first-class citizen that does 
> not need any special payloads/merges, we need to move all the CDC payloads to 
> be transformers in the Hudi Streamer and SQL write paths, and provide 
> migration instructions to users. 
>  # Partial update has been implemented for the Spark SQL source as follows:
>  ## Configuration {{hoodie.write.partial.update.schema}} is used for 
> partial update.
>  ## {{ExpressionPayload}} creates the writer schema based on the 
> configuration.
>  ## {{HoodieAppendHandle}} creates the log file based on the configuration 
> and the corresponding partial schema.
>  ## Currently this handle assumes these records are all update records.
>  ## We need to understand whether ExpressionPayload/SQL Merger is needed going 
> forward. 
>  # For DeltaStreamer, our goal is to remove all silo CDC payloads, e.g., 
> Debezium or AWSDMS, and to provide CDC data as {{InternalRow}} type. 
> Therefore,
>  ## The {{transformer}} in DeltaStreamer prepares the data according to the 
> types of the sources.
>  ## Initially, it's okay to just support full row updates/deletes/... 
>  # Audit that all of them properly combine I/U/D into data and delete 
> blocks, such that U-after-D and D-after-U scenarios are handled as expected.
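
As a rough illustration of the ordering scenarios in the last item, the sketch below applies insert/update/delete events to keyed rows in ordering-value order, with updates allowed to carry only a subset of columns. It is not Hudi code: `ChangeEvent`, `apply_events`, and the `ordering` field are hypothetical names, and the semantics shown (a delete drops the row, a later update recreates it from whatever fields the update carries) are an assumption about what "handled as expected" could mean, not something the issue specifies.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical CDC event; not a Hudi type."""
    key: str
    op: str                      # "I" (insert), "U" (update), or "D" (delete)
    ordering: int                # assumed ordering value, e.g. a source sequence number
    fields: Dict[str, Any] = field(default_factory=dict)  # may hold only changed columns


def apply_events(base: Dict[str, Dict[str, Any]],
                 events: Iterable[ChangeEvent]) -> Dict[str, Dict[str, Any]]:
    """Apply events in ordering order and return the resulting rows by key."""
    table = {k: dict(v) for k, v in base.items()}  # copy so the input is not mutated
    for ev in sorted(events, key=lambda e: e.ordering):
        if ev.op == "D":
            # A delete removes the row, whatever came before it.
            table.pop(ev.key, None)
        elif ev.op in ("I", "U"):
            # Partial update: overlay only the columns present in the event.
            merged = dict(table.get(ev.key, {}))
            merged.update(ev.fields)
            table[ev.key] = merged
        else:
            raise ValueError(f"unknown operation: {ev.op!r}")
    return table


base = {"k1": {"name": "x", "price": 1}}

# D after U: the row is gone.
d_after_u = apply_events(base, [
    ChangeEvent("k1", "U", 1, {"price": 2}),
    ChangeEvent("k1", "D", 2),
])

# U after D: the row reappears with only the updated column.
u_after_d = apply_events(base, [
    ChangeEvent("k1", "D", 1),
    ChangeEvent("k1", "U", 2, {"price": 3}),
])
```

Events are sorted by `ordering` before being applied, so arrival order does not matter in this sketch. A real implementation would also need a rule for ties and for updates that arrive with an older ordering value than the stored row.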



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
