Sagar Sumit created HUDI-3844:
---------------------------------
Summary: HoodieIndexer should set existing MDT partitions in props
if not already set by user
Key: HUDI-3844
URL: https://issues.apache.org/jira/browse/HUDI-3844
Project: Apache Hudi
Issue Type: Bug
Reporter: Sagar Sumit
Assignee: Sagar Sumit

Currently, the indexer assumes that only the partitions set by the user
(in the props passed to the indexer) are enabled. While fetching the metadata
writer, it then deletes all other partitions except FILES.

For instance, suppose the ingestion writer had metadata enabled (and hence the
FILES partition) along with the BLOOM_FILTERS index. After a few commits, the
metadata table contains the files and bloom_filters partitions, as expected.
Now the user wants to create the COLUMN_STATS index using the indexer, and so
enables metadata and the column_stats index in the props passed to the indexer.
In this scenario, the indexer presumes that only files and column_stats are
enabled and that bloom_filters is disabled, so the call to
table.getMetadataWriter() decides that bloom_filters needs to be removed,
which is wrong.

The indexer should not presume which indexes (or MDT partitions) are disabled.
Instead, it should update its props based on the table config. If a partition
exists because of regular writers, the indexer should not delete it.
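
To make the scenario concrete, here is a minimal, self-contained sketch of the
two behaviors. It does not use Hudi's classes: IndexerPartitionSketch,
MdtPartition, partitionsDeletedToday and effectivePartitions are hypothetical
names chosen for illustration, and plain sets stand in for the table config
and the indexer props. The actual fix may well look different.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in types; these are NOT Hudi's actual classes or API.
class IndexerPartitionSketch {
    enum MdtPartition { FILES, BLOOM_FILTERS, COLUMN_STATS }

    // Behavior described in this issue: every existing partition that the
    // indexer props did not request is treated as disabled, except FILES.
    static Set<MdtPartition> partitionsDeletedToday(Set<MdtPartition> existing,
                                                    Set<MdtPartition> requested) {
        Set<MdtPartition> toDelete = EnumSet.noneOf(MdtPartition.class);
        toDelete.addAll(existing);
        toDelete.removeAll(requested);
        toDelete.remove(MdtPartition.FILES);
        return toDelete;
    }

    // Proposed behavior (assumed shape): treat partitions already recorded in
    // the table config as enabled, in addition to what the user requested.
    static Set<MdtPartition> effectivePartitions(Set<MdtPartition> existing,
                                                 Set<MdtPartition> requested) {
        Set<MdtPartition> enabled = EnumSet.noneOf(MdtPartition.class);
        enabled.addAll(existing);
        enabled.addAll(requested);
        return enabled;
    }

    public static void main(String[] args) {
        // Partitions created by the regular ingestion writer.
        Set<MdtPartition> existing =
            EnumSet.of(MdtPartition.FILES, MdtPartition.BLOOM_FILTERS);
        // Partitions enabled in the props passed to the indexer.
        Set<MdtPartition> requested =
            EnumSet.of(MdtPartition.FILES, MdtPartition.COLUMN_STATS);

        // BLOOM_FILTERS is wrongly selected for deletion.
        System.out.println("deleted today: "
            + partitionsDeletedToday(existing, requested));

        // With the merged view, nothing that already exists is dropped.
        Set<MdtPartition> enabled = effectivePartitions(existing, requested);
        System.out.println("enabled after merge: " + enabled);
        System.out.println("deleted after merge: "
            + partitionsDeletedToday(existing, enabled));
    }
}
```

With the merged set, the same deletion check finds nothing to remove, which is
the outcome this issue asks for.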