sivabalan narayanan created HUDI-8384:
-----------------------------------------

             Summary: Write functional tests for Cols stats partition 
                 Key: HUDI-8384
                 URL: https://issues.apache.org/jira/browse/HUDI-8384
             Project: Apache Hudi
          Issue Type: Improvement
          Components: metadata
            Reporter: sivabalan narayanan


 

We need to ensure that we cover the following cases for basic col stats 
certification:
 # Insert a few records and validate the stats. Update the same records and 
validate that the updates are reflected. Repeat the updates and validate the 
stats again. For MOR, trigger compaction and validate.
 # For MOR, ensure we cover all log block types (data blocks, delete blocks, 
and rollback blocks).
 # Trigger clustering on top of case 1 and validate the stats. For MOR, trigger 
clustering both before and after compaction. Ensure that no stats are 
available for the replaced file groups.
 # Insert a few records, update them, and then delete a subset of records 
chosen so that the deletes change the min and max values. Validate.
 # Add a test for async compaction and validate, i.e. some log files are added 
to the new phantom file slice and the stats remain intact.
 # Add a test for a non-partitioned table.
 # Trigger clean and ensure that the cleaned-up files are deleted from col 
stats. Lookups for those files should return no entry at all, not even null 
stats.
 # Trigger rollbacks and validate, i.e. insert, then update (partially failed). 
Validate that only stats pertaining to the inserts are reflected. Trigger a 
rollback and validate that the stats are still the same. Retry the updates; 
the stats should now reflect the updated records.
 # Add one long-running test, i.e. with 20+ commits and an aggressive cleaner 
and archival, just for sanity. Alternatively, if we can enable all kinds of 
indexes in an existing sanity test, we should be good.
 # Test all write operations: bulk_insert, insert, upsert, delete, 
insert_overwrite, insert_overwrite_table, delete_partition.
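
The validation step shared by most of these cases (recompute the expected 
min/max per file from the records, then compare with what the index returns) 
can be sketched independently of Hudi. This is a minimal illustration only, 
not Hudi's API: ColStat, compute_expected_stats and diff_stats are hypothetical 
helpers, and the plain dictionaries stand in for the table's file slices and 
for whatever the col stats partition returns in a real functional test.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ColStat:
    """Hypothetical per-file column stats record (not a Hudi class)."""
    min_value: Optional[int]
    max_value: Optional[int]
    null_count: int
    value_count: int


def compute_expected_stats(files: Dict[str, List[dict]], column: str) -> Dict[str, ColStat]:
    """Recompute stats for `column` from the records held by each file."""
    expected = {}
    for file_name, records in files.items():
        values = [r.get(column) for r in records]
        non_null = [v for v in values if v is not None]
        expected[file_name] = ColStat(
            min_value=min(non_null) if non_null else None,
            max_value=max(non_null) if non_null else None,
            null_count=len(values) - len(non_null),
            value_count=len(values),
        )
    return expected


def diff_stats(expected: Dict[str, ColStat], actual: Dict[str, ColStat]) -> List[str]:
    """List every disagreement between expected stats and index contents."""
    problems = []
    for f in sorted(expected.keys() - actual.keys()):
        problems.append(f"missing stats for {f}")
    for f in sorted(actual.keys() - expected.keys()):
        # Entries for cleaned or replaced files must be gone entirely.
        problems.append(f"stale stats for {f}")
    for f in sorted(expected.keys() & actual.keys()):
        if expected[f] != actual[f]:
            problems.append(f"mismatch for {f}")
    return problems


# Insert three records into file f1; the index is assumed to be in sync.
after_insert = {"f1": [{"id": 1, "fare": 5}, {"id": 2, "fare": 10}, {"id": 3, "fare": 20}]}
index = compute_expected_stats(after_insert, "fare")
assert diff_stats(compute_expected_stats(after_insert, "fare"), index) == []

# Delete the min and max records: a new file f2 holds only fare=10.
after_delete = dict(after_insert, f2=[{"id": 2, "fare": 10}])
expected = compute_expected_stats(after_delete, "fare")
assert expected["f2"] == ColStat(10, 10, 0, 1)

# Clean removes f1; an index that still lists f1 is reported as stale.
after_clean = {"f2": after_delete["f2"]}
assert diff_stats(compute_expected_stats(after_clean, "fare"), expected) == ["stale stats for f1"]
```

In an actual test the "expected" side would come from reading the data files 
directly and the "actual" side from the col stats partition; only the 
comparison logic carries over from this sketch.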



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
