sivabalan narayanan created HUDI-8384:
-----------------------------------------
Summary: Write functional tests for Cols stats partition
Key: HUDI-8384
URL: https://issues.apache.org/jira/browse/HUDI-8384
Project: Apache Hudi
Issue Type: Improvement
Components: metadata
Reporter: sivabalan narayanan
We need to ensure that we cover the following cases for basic col stats
certification:
# Insert a few records and validate. Update the same records and validate that
the updates are reflected. Repeat the updates and validate the stats.
For MOR, trigger compaction and validate.
# For MOR, let's ensure we cover all log block types (data blocks, delete
blocks, and rollback blocks).
# Trigger clustering on top of case 1 and validate stats.
a. For MOR, let's trigger clustering before compaction and also after
compaction. Ensure that no stats are available for the replaced file groups.
# Insert a few records, update them, and delete a subset of records chosen so
that the min and max values are affected. Validate.
# Let's add a test for async compaction and validate, i.e. some log files are
added to a new phantom file slice and the stats are intact.
# Let's have a test for a non-partitioned table.
# Trigger clean and ensure cleaned-up files are deleted from col stats. A
lookup for them should not even return null stats.
# Let's trigger rollbacks and validate, i.e. insert, then update (partially
failed). Validate that only stats pertaining to the inserts are reflected.
Trigger a rollback and validate that the stats are still the same. Retry the
updates; the stats should now reflect the updated records.
# Let's add one long-running test, i.e. with 20+ commits and an aggressive
cleaner and archival, just for sanity. Alternatively, if we can enable all
kinds of indexes in an existing sanity test, we should be good.
# Let's test all write operations: bulk_insert, insert, upsert, delete,
insert_overwrite, insert_overwrite_table, delete_partition.
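
As a rough illustration of what the validation in cases 4 and 7 could compare against, here is a minimal sketch of an expected-stats oracle. It is a toy in-memory model, not Hudi code: ColumnStats, expected_stats, and the index dict are hypothetical names introduced only for this example, and the choice to count nulls in value_count is an assumption. A real test would read the actual stats from the metadata table and compare them with values recomputed this way.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ColumnStats:
    """Hypothetical stand-in for the per-file stats of one column."""
    min_value: Optional[int]
    max_value: Optional[int]
    null_count: int
    value_count: int  # assumption: counts every record, nulls included


def expected_stats(records: Dict[str, Optional[int]]) -> ColumnStats:
    """Recompute stats from the live records of one file (record key -> column value)."""
    values = list(records.values())
    non_null = [v for v in values if v is not None]
    return ColumnStats(
        min_value=min(non_null) if non_null else None,
        max_value=max(non_null) if non_null else None,
        null_count=len(values) - len(non_null),
        value_count=len(values),
    )


# Case 4: insert, update, then delete the records that hold the extremes.
records: Dict[str, Optional[int]] = {"k1": 5, "k2": 10, "k3": 42, "k4": None}
after_insert = expected_stats(records)

records["k2"] = 100  # the update raises the max
after_update = expected_stats(records)

for key in ("k1", "k2"):  # delete the current min and max
    del records[key]
after_delete = expected_stats(records)

# Case 7: a cleaned file should vanish from the index entirely,
# rather than remain as an entry with null stats.
index = {"file-1": after_insert, "file-2": after_delete}
index.pop("file-1")  # stand-in for the cleaner removing file-1
```

Recomputing the expectation from the live records after every step keeps the oracle independent of how the stats are merged internally, so the same helper could back the update, delete, and rollback checks.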