[ 
https://issues.apache.org/jira/browse/HUDI-8916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Y Ethan Guo updated HUDI-8916:
------------------------------
    Description: In the prepped upsert flow for SQL UPDATE and DELETE, we use
a snapshot read to get the meta columns and determine the instant time and
current location of the records.  If a record has updates in log files, the
instant time stored in the current location of its HoodieRecord might not be
the base file instant time, because merging in a snapshot read can pick the
record and commit time from a log file.  This contradicts the assumption that
the stored instant time should be the base file instant time, as it would be
from indexing in Spark.  (was: When doing )

> Return base instant time in prepped upsert flow for SQL UPDATE and DELETE
> -------------------------------------------------------------------------
>
>                 Key: HUDI-8916
>                 URL: https://issues.apache.org/jira/browse/HUDI-8916
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Y Ethan Guo
>            Priority: Blocker
>             Fix For: 1.0.1
>
>   Original Estimate: 16h
>  Remaining Estimate: 16h
>
> In the prepped upsert flow for SQL UPDATE and DELETE, we use a snapshot
> read to get the meta columns and determine the instant time and current
> location of the records.  If a record has updates in log files, the instant
> time stored in the current location of its HoodieRecord might not be the
> base file instant time, because merging in a snapshot read can pick the
> record and commit time from a log file.  This contradicts the assumption
> that the stored instant time should be the base file instant time, as it
> would be from indexing in Spark.
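
The mismatch described above can be illustrated with a small toy model. This
is only a sketch under stated assumptions, not Hudi code: FileSlice, LogBlock,
snapshot_read, and both location helpers are hypothetical names invented for
this example, and instant times are modeled as plain sortable strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LogBlock:
    """Hypothetical stand-in for updates written to a log file."""
    instant_time: str
    records: Dict[str, str]  # record key -> payload


@dataclass(frozen=True)
class FileSlice:
    """Hypothetical stand-in for a base file plus its log files."""
    file_id: str
    base_instant_time: str
    base_records: Dict[str, str]  # record key -> payload
    log_blocks: List[LogBlock]


def snapshot_read(fs: FileSlice) -> Dict[str, Tuple[str, str]]:
    """Merge base and log records; return key -> (payload, commit time).

    A record updated in a log block carries that block's instant time,
    mirroring how merging can pick the commit time from the log file.
    """
    merged = {k: (v, fs.base_instant_time) for k, v in fs.base_records.items()}
    for block in sorted(fs.log_blocks, key=lambda b: b.instant_time):
        for key, payload in block.records.items():
            merged[key] = (payload, block.instant_time)
    return merged


def location_from_meta_columns(fs: FileSlice, key: str) -> Tuple[str, str]:
    """Problematic variant: take the instant time from the merged record."""
    _, commit_time = snapshot_read(fs)[key]
    return (commit_time, fs.file_id)


def location_with_base_instant(fs: FileSlice, key: str) -> Tuple[str, str]:
    """Assumed fix: always report the base file instant time."""
    if key not in snapshot_read(fs):
        raise KeyError(key)
    return (fs.base_instant_time, fs.file_id)


# Record "a" is updated in a log file at instant "002"; "b" is untouched.
example = FileSlice(
    file_id="f1",
    base_instant_time="001",
    base_records={"a": "v1", "b": "v1"},
    log_blocks=[LogBlock(instant_time="002", records={"a": "v2"})],
)
```

In this model the two helpers agree for a record that lives only in the base
file and differ only for a record with log updates, which is the case the
issue is about.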



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
