sivabalan narayanan created HUDI-6460:
-----------------------------------------
Summary: Fix Hbase Index for deletes
Key: HUDI-6460
URL: https://issues.apache.org/jira/browse/HUDI-6460
Project: Apache Hudi
Issue Type: Improvement
Components: index
Reporter: sivabalan narayanan
With delete support added for RLI (record level index) in
[https://github.com/apache/hudi/pull/9058/files],
the HBase index needs some fixes.
The failing test is:
TestSparkHoodieHBaseIndex.testTagLocationAndPartitionPathUpdateWithExplicitRollback
Root cause:
When update partition path is set to true, a single batch can contain both a
deleted record and a new insert record for the same record key. Both records
are sent to HBase, and for some keys the insert takes precedence, while for
others the delete takes precedence.
We need to fix SparkHoodieHBaseIndex.updateLocation to do one pass over the
WriteStatus and de-dup whenever a key has two records, one of them deleted and
the other inserted.
However, a batch may also contain only deletes, so in such cases we need
to ensure the deletes are still routed to HBase.
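
A minimal sketch of the one-pass de-dup described above, not the actual
patch. IndexUpdate and dedup are hypothetical stand-ins and are not Hudi
classes or methods. The sketch also assumes that the insert should win when
both are present for a key, since the record still exists at its new
location; the issue itself does not say which one wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexUpdateDedup {

    // Hypothetical stand-in for one index mutation derived from a WriteStatus
    // entry. This is NOT a Hudi class.
    public static final class IndexUpdate {
        public final String recordKey;
        public final String partitionPath;
        public final boolean isDelete;

        public IndexUpdate(String recordKey, String partitionPath, boolean isDelete) {
            this.recordKey = recordKey;
            this.partitionPath = partitionPath;
            this.isDelete = isDelete;
        }
    }

    // Single pass over the batch, keeping at most one mutation per record key.
    // Assumption: an insert overrides a delete for the same key, regardless of
    // the order in which the two are seen. A key that only has a delete keeps
    // that delete, so it is still routed to the index.
    public static List<IndexUpdate> dedup(List<IndexUpdate> updates) {
        Map<String, IndexUpdate> byKey = new LinkedHashMap<>();
        for (IndexUpdate u : updates) {
            IndexUpdate seen = byKey.get(u.recordKey);
            if (seen == null || (seen.isDelete && !u.isDelete)) {
                byKey.put(u.recordKey, u);
            }
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<IndexUpdate> batch = Arrays.asList(
            new IndexUpdate("key1", "2016/01/31", true),   // delete from old partition
            new IndexUpdate("key1", "2016/02/28", false),  // insert into new partition
            new IndexUpdate("key2", "2016/01/31", true));  // delete only
        for (IndexUpdate u : dedup(batch)) {
            System.out.println(u.recordKey + " " + u.partitionPath + " delete=" + u.isDelete);
        }
    }
}
```

With this rule, every key reaches HBase with a single deterministic mutation,
and keys that only have a delete are still forwarded.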
--
This message was sent by Atlassian Jira
(v8.20.10#820010)