[
https://issues.apache.org/jira/browse/HUDI-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amrish Lal updated HUDI-6315:
-----------------------------
Description: For MERGE INTO (MIT), UPDATE and DELETE, we first do a lookup in
Hudi to find the records matching the predicates and then trigger the writes.
The records fetched from Hudi already contain all the meta fields required
for key generation and index lookup (record key, partition path, file name,
commit time). As of now, however, we drop those meta fields and trigger an
upsert to Hudi (as though someone were writing via spark-datasource). The
write therefore goes through the regular code path of key generation and
index lookup, which is unnecessary.
> Optimize UPSERT codepath to use meta fields instead of key generation and
> index lookup
> --------------------------------------------------------------------------------------
>
> Key: HUDI-6315
> URL: https://issues.apache.org/jira/browse/HUDI-6315
> Project: Apache Hudi
> Issue Type: New Feature
> Reporter: Amrish Lal
> Priority: Major
> Labels: pull-request-available
>
> For MERGE INTO (MIT), UPDATE and DELETE, we first do a lookup in Hudi to
> find the records matching the predicates and then trigger the writes. The
> records fetched from Hudi already contain all the meta fields required for
> key generation and index lookup (record key, partition path, file name,
> commit time). As of now, however, we drop those meta fields and trigger an
> upsert to Hudi (as though someone were writing via spark-datasource). This
> write therefore goes through the regular code path of key generation and
> index lookup, which is unnecessary.
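
To illustrate the idea in the description, here is a minimal Python sketch
(not Hudi's actual implementation) of reusing the meta fields of a fetched
row instead of regenerating the key and looking up the index. The
`_hoodie_*` column names are Hudi's meta columns; `TaggedRecord` and
`tag_from_meta_fields` are hypothetical names, and the sketch assumes the
base file name starts with the file ID followed by an underscore.

```python
from dataclasses import dataclass

# Hudi meta columns carried by every row read back from a table.
RECORD_KEY = "_hoodie_record_key"
PARTITION_PATH = "_hoodie_partition_path"
FILE_NAME = "_hoodie_file_name"
COMMIT_TIME = "_hoodie_commit_time"
META_PREFIX = "_hoodie_"


@dataclass(frozen=True)
class TaggedRecord:
    """Hypothetical holder for a record whose key and location are known."""
    record_key: str
    partition_path: str
    file_id: str
    commit_time: str
    data: dict


def tag_from_meta_fields(row: dict) -> TaggedRecord:
    """Build a tagged record directly from the meta fields of a fetched row.

    No key generator and no index lookup are involved: the key, partition
    path and file are read from the row itself.
    """
    # Assumption: the file name has the form "<fileId>_<rest>".
    file_id = row[FILE_NAME].split("_", 1)[0]
    # The user-visible columns are everything except the meta columns.
    data = {k: v for k, v in row.items() if not k.startswith(META_PREFIX)}
    return TaggedRecord(
        record_key=row[RECORD_KEY],
        partition_path=row[PARTITION_PATH],
        file_id=file_id,
        commit_time=row[COMMIT_TIME],
        data=data,
    )
```

In the current code path the equivalent of `data` is all that survives, so
the key and the file location have to be recomputed on write.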