sivabalan narayanan created HUDI-6460:
-----------------------------------------

             Summary: Fix Hbase Index for deletes
                 Key: HUDI-6460
                 URL: https://issues.apache.org/jira/browse/HUDI-6460
             Project: Apache Hudi
          Issue Type: Improvement
          Components: index
            Reporter: sivabalan narayanan


After delete support was added for RLI in 
[https://github.com/apache/hudi/pull/9058/files], 
the HBase index needs some fixes. 

The failing test is:

TestSparkHoodieHBaseIndex.testTagLocationAndPartitionPathUpdateWithExplicitRollback

 

Root cause:

When update partition path is set to true, the same batch contains both a 
deleted record and a new insert record for the same key. Both records are sent 
to HBase, and for some keys the insert takes precedence, while for others the 
delete takes precedence. 

 

We need to fix SparkHoodieHBaseIndex.updateLocation to do one pass over the 
WriteStatus and de-dup when there are two records for the same key, one of 
them deleted and the other inserted. 

However, a batch may contain only deletes; in such cases we need to ensure 
the deletes are still routed to HBase. 
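
A minimal sketch of the de-dup pass described above, not the actual Hudi code. 
IndexUpdate and dedup are hypothetical names used only for illustration; a null 
newLocation stands in for a delete. The sketch assumes that when a key has both 
a delete and an insert in the same batch, the insert should win; a key that 
only has a delete is kept so it is still routed to HBase. 

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexUpdateDedup {

    // Hypothetical stand-in for one written record's index update.
    // newLocation == null models a delete (the key has no new location).
    static final class IndexUpdate {
        final String recordKey;
        final String newLocation;

        IndexUpdate(String recordKey, String newLocation) {
            this.recordKey = recordKey;
            this.newLocation = newLocation;
        }

        boolean isDelete() {
            return newLocation == null;
        }
    }

    // One pass over the batch. Assumption: an insert replaces a delete seen
    // earlier for the same key, and a delete never replaces an insert.
    // Keys that only ever appear as deletes stay in the result as deletes.
    static List<IndexUpdate> dedup(List<IndexUpdate> updates) {
        Map<String, IndexUpdate> byKey = new LinkedHashMap<>();
        for (IndexUpdate update : updates) {
            IndexUpdate seen = byKey.get(update.recordKey);
            if (seen == null || (seen.isDelete() && !update.isDelete())) {
                byKey.put(update.recordKey, update);
            }
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<IndexUpdate> batch = new ArrayList<>();
        batch.add(new IndexUpdate("key1", null));         // delete from old partition
        batch.add(new IndexUpdate("key1", "partitionB")); // insert into new partition
        batch.add(new IndexUpdate("key2", null));         // delete only

        for (IndexUpdate update : dedup(batch)) {
            System.out.println(update.recordKey + " -> "
                + (update.isDelete() ? "DELETE" : update.newLocation));
        }
    }
}
```

With this ordering-independent rule, the result no longer depends on whether 
the delete or the insert for a key reaches HBase last. 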


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
