[ 
https://issues.apache.org/jira/browse/HUDI-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amrish Lal updated HUDI-6315:
-----------------------------
    Description: For MIT, Update and Delete, we do a lookup in Hudi to find 
matching records based on the predicates and then trigger the writes that 
follow. The records fetched from Hudi already contain all the meta fields 
required for key generation and index lookup (such as the record key, 
partition path, file name, and commit time). As of now, however, we drop 
those meta fields and trigger an upsert to Hudi (as though someone were 
writing via spark-datasource). This goes through the regular code path of 
key generation and index lookup, which is unnecessary. 

> Optimize UPSERT codepath to use meta fields instead of key generation and 
> index lookup
> --------------------------------------------------------------------------------------
>
>                 Key: HUDI-6315
>                 URL: https://issues.apache.org/jira/browse/HUDI-6315
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Amrish Lal
>            Priority: Major
>              Labels: pull-request-available
>
> For MIT, Update and Delete, we do a lookup in Hudi to find matching records 
> based on the predicates and then trigger the writes that follow. The 
> records fetched from Hudi already contain all the meta fields required for 
> key generation and index lookup (such as the record key, partition path, 
> file name, and commit time). As of now, however, we drop those meta fields 
> and trigger an upsert to Hudi (as though someone were writing via 
> spark-datasource). This goes through the regular code path of key 
> generation and index lookup, which is unnecessary. 
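
The shortcut described above can be sketched roughly as follows. This is a
minimal illustration in Python, not Hudi's implementation: RecordLocation and
location_from_meta_fields are hypothetical names, and taking the file id to be
the text before the first underscore of the file name is an assumption about
the base file naming scheme. The _hoodie_* names are Hudi's meta columns.

```python
from dataclasses import dataclass

# Hudi meta columns carried by every record read back from a table.
RECORD_KEY = "_hoodie_record_key"
PARTITION_PATH = "_hoodie_partition_path"
FILE_NAME = "_hoodie_file_name"
COMMIT_TIME = "_hoodie_commit_time"
REQUIRED = (RECORD_KEY, PARTITION_PATH, FILE_NAME, COMMIT_TIME)


@dataclass(frozen=True)
class RecordLocation:
    """Hypothetical holder for what key generation plus index lookup
    would otherwise have to compute for each incoming record."""
    record_key: str
    partition_path: str
    file_id: str
    instant_time: str


def location_from_meta_fields(row: dict) -> RecordLocation:
    """Build the key and file location directly from a fetched row."""
    missing = [name for name in REQUIRED if not row.get(name)]
    if missing:
        # Without the meta fields the regular upsert path is still needed.
        raise ValueError(f"missing meta fields: {missing}")
    # Assumption: the file name has the form "<fileId>_<rest>", so the
    # file id is everything before the first underscore.
    file_id = row[FILE_NAME].split("_", 1)[0]
    return RecordLocation(
        record_key=row[RECORD_KEY],
        partition_path=row[PARTITION_PATH],
        file_id=file_id,
        instant_time=row[COMMIT_TIME],
    )
```

A caller would try this first and fall back to the regular key generation and
index lookup path whenever a row lacks any of the meta fields.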



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
