[
https://issues.apache.org/jira/browse/HUDI-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866210#comment-17866210
]
Ethan Guo commented on HUDI-7971:
---------------------------------
We can leverage and extend the bundle validation script and framework
([hudi/packaging/bundle-validation at master · apache/hudi ·
GitHub|https://github.com/apache/hudi/tree/master/packaging/bundle-validation])
to implement the compatibility tests. These tests should be added as a GitHub
CI workflow so that they run on every PR.
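As a rough sketch of how such a workflow might enumerate its cases, the snippet
below builds a matrix from a few of the dimensions listed in the issue
description quoted below. The labels, the `build_matrix` helper, and the
reading of "0.14.x to 0.16.x" as three release lines are assumptions made for
illustration; none of this is part of the existing bundle-validation scripts.

```python
from itertools import product

# Dimension values are taken from the issue description; the labels and the
# build_matrix helper are hypothetical and only illustrate the enumeration.
READERS = ["spark-sql", "spark-datasource", "trino-presto", "hive", "flink"]
TABLE_VERSIONS = ["0.14.x", "0.15.x", "0.16.x"]  # assumed reading of "0.14.x to 0.16.x"
TABLE_TYPES = ["COW", "MOR"]
METADATA_ENABLED = [True, False]


def build_matrix():
    """Return one dict per (reader, table version, table type, metadata) combination."""
    return [
        {
            "reader": reader,
            "table_version": version,
            "table_type": table_type,
            "metadata_enabled": metadata,
        }
        for reader, version, table_type, metadata in product(
            READERS, TABLE_VERSIONS, TABLE_TYPES, METADATA_ENABLED
        )
    ]


if __name__ == "__main__":
    matrix = build_matrix()
    print(f"{len(matrix)} combinations")
    print(matrix[0])
```

Each entry could then be handed to a validation script as one CI job. The
remaining knobs (column stats, RLI, CDC reads, and so on) would extend the
product in the same way.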
> Test and Certify 0.14.x to 0.16.x tables are readable in 1.x Hudi reader
> -------------------------------------------------------------------------
>
> Key: HUDI-7971
> URL: https://issues.apache.org/jira/browse/HUDI-7971
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: sivabalan narayanan
> Priority: Major
> Fix For: 1.0.0
>
>
> Let's ensure the 1.x reader is fully compatible with reading any 0.14.x to 0.16.x
> tables.
>
> Readers: 1.x
> # Spark SQL
> # Spark Datasource
> # Trino/Presto
> # Hive
> # Flink
> Writer: 0.16
> Table State:
> * COW
> ** few write commits
> ** Pending clustering
> ** Completed Clustering
> ** Failed writes with no rollbacks
> ** Insert overwrite table/partition
> ** Savepoint for Time-travel query
> * MOR
> ** Same as COW
> ** Pending and completed async compaction (with log-files and no base file)
> ** Custom Payloads (for MOR snapshot queries) (e.g., SQL Expression Payload)
> ** Log block formats - DELETE, rollback block
> Other knobs:
> # Metadata enabled/disabled (all combinations)
> # Column Stats enabled/disabled and data-skipping enabled/disabled
> # RLI enabled with eq/IN queries
> # Non-Partitioned dataset (all combinations)
> # CDC Reads
> # Incremental Reads
> # Time-travel query
>
> What to test?
> # Query Results Correctness
> # Performance: See the benefit of
> ## Partition Pruning
> ## Metadata table - col stats, RLI
>
> Corner Case Testing:
>
> # Schema Evolution with different file-groups having different generations of
> the schema
> # Dynamic Partition Pruning
> # Whether Column Projection works correctly when reading log files