[ 
https://issues.apache.org/jira/browse/HUDI-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Feng updated HUDI-4882:
----------------------------
    Description: 
We have two sources and one target table:
* source1's fields: *id, ts, name*
* source2's fields: *id, ts, price*
* target table's fields: *id, ts, name, price*
ts is the precombine field.

In the first batch, we receive one record from each source:
   Source 1:
       id      ts      name
       1       1       name_1
   Source 2:
       id      ts      price
       1       2       price_2
so the record in the target table should be:
       id      ts      price
       1       2       price_2

This feature will allow users to perform partial updates across 
sub-tables/sources by determining the state of a set of columns in a row based 
on an ordering/precombine column.

As such, a table can have MULTIPLE ordering fields.

This feature suits wide Hudi tables that are assembled from smaller 
sub-tables, where each sub-table has its own precombine column and 
its records may be upserted out of order.
 !image-2022-09-20-22-46-52-907.png! 
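
The merge rule described above can be sketched in plain Python. This is only an illustration of the idea, not Hudi code or a Hudi API: the names ORDERING_GROUPS, ts_name, ts_price and merge_partial are hypothetical, and the sketch assumes each source carries its own ordering field (ts_name for source 1, ts_price for source 2) and that ties are resolved in favour of the incoming record.

```python
from typing import Any, Dict, List, Optional

# Hypothetical mapping: each ordering field governs its own group of columns.
# These names are illustrative and are not taken from Hudi.
ORDERING_GROUPS: Dict[str, List[str]] = {
    "ts_name": ["name"],    # ordering field assumed for source 1
    "ts_price": ["price"],  # ordering field assumed for source 2
}


def merge_partial(current: Optional[Dict[str, Any]],
                  incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Merge an incoming partial record into the current row, group by group."""
    merged: Dict[str, Any] = dict(current) if current else {}
    merged["id"] = incoming["id"]
    for ordering_field, columns in ORDERING_GROUPS.items():
        new_ts = incoming.get(ordering_field)
        if new_ts is None:
            # The incoming record carries nothing for this column group.
            continue
        old_ts = merged.get(ordering_field)
        # Assumption: on equal ordering values the incoming record wins.
        if old_ts is None or new_ts >= old_ts:
            merged[ordering_field] = new_ts
            for col in columns:
                merged[col] = incoming.get(col)
    return merged


row = merge_partial(None, {"id": 1, "ts_name": 1, "name": "name_1"})
row = merge_partial(row, {"id": 1, "ts_price": 2, "price": "price_2"})
# A late, out-of-order record for the name group is ignored.
row = merge_partial(row, {"id": 1, "ts_name": 0, "name": "stale"})
print(row)
```

Because each column group is compared only against its own ordering field, a stale record from one source cannot overwrite newer columns written by another source.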



  was:
Let's say we have a stream or table A.
*  Case 1
 *  Current data:
 *      id      ts1      name    price
 *      1       1       name_1  price_1
 *  Insert data:
 *      id      ts 2     name    price
 *      1       2       null    price_2

This feature will allow users to perform partial updates across 
sub-tables/sources by determining the state of a set of columns in a row based 
on an ordering/precombine column.

As such, a table can have MULTIPLE ordering fields.

This use case is suitable for wide Hudi tables that are created from smaller 
sub-tables, where each of its sub-tables has its own precombine column, and 
where its records could be upserted out of order.
 !image-2022-09-20-22-46-52-907.png! 




> Multiple ordering fields for partial update
> -------------------------------------------
>
>                 Key: HUDI-4882
>                 URL: https://issues.apache.org/jira/browse/HUDI-4882
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Jian Feng
>            Priority: Major
>         Attachments: image-2022-09-20-22-42-19-445.png, 
> image-2022-09-20-22-46-52-907.png
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
