[
https://issues.apache.org/jira/browse/HUDI-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rajesh Mahindra updated HUDI-2814:
----------------------------------
Fix Version/s: 0.11.0 (was: 0.10.0)
> Address issues w/ Z-order Layout Optimization
> ---------------------------------------------
>
> Key: HUDI-2814
> URL: https://issues.apache.org/jira/browse/HUDI-2814
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Index
> Reporter: Alexey Kudinkin
> Assignee: Alexey Kudinkin
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.11.0
>
>
> During extensive testing, the following issues were discovered, which we're
> planning to address in the upcoming PR:
> * The data-skipping sequence incorrectly handles cases where columns that
> are not Z-sorted are present in the query (it simply ignores this fact,
> while it should abandon pruning altogether[1])
> * An exception within the file-pruning sequence should not affect the
> overall query (in the worst case, it should fall back to a full scan)
> * The merging sequence prefers records from the old Z-index table, while it
> should prefer those from the new one
> * After the clustering columns change, the Z-index update should simply
> overwrite the index (currently it does the opposite: it skips updating the
> index when the old and new tables diverge in schema)
> * Incorrect type conversions (for example, Decimal is converted to Double)
> Additionally, we're planning to beef up the test suite of the current
> Z-index implementation, making sure that all critical flows of Z-indexing
> have appropriate coverage.
> [1] Actually, with more advanced analysis we could still prune the search
> space, but this would require a substantially more sophisticated analysis,
> which is beyond our current focus.
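
For illustration, the first two items in the list above (abandon pruning when the query references a column that is not Z-sorted, and fall back to a full scan if pruning itself fails) could be sketched roughly as follows. This is not Hudi code: `FileStats`, `prune_files`, and the equality-only predicate model are hypothetical names and simplifications made up for this sketch, and the index is assumed to hold one min/max range per indexed column per file.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class FileStats:
    """Hypothetical index entry: per-column (min, max) for one data file."""
    path: str
    ranges: Dict[str, Tuple[int, int]]


def prune_files(
    all_files: List[str],
    index: List[FileStats],
    predicates: Dict[str, int],   # simplified query: column -> required value
    indexed_columns: Set[str],    # columns covered by the (hypothetical) index
) -> List[str]:
    """Return the files that may contain matching rows.

    Always errs on the side of reading more: any doubt means a full scan.
    """
    try:
        # The query touches a column the index does not cover, so abandon
        # pruning altogether instead of silently ignoring that column.
        if not set(predicates) <= indexed_columns:
            return list(all_files)

        # Keep a file only if every predicate value falls inside its range.
        kept = {
            stats.path
            for stats in index
            if all(
                stats.ranges[col][0] <= value <= stats.ranges[col][1]
                for col, value in predicates.items()
            )
        }
        return [path for path in all_files if path in kept]
    except Exception:
        # A failure inside pruning must not fail the query: scan everything.
        return list(all_files)


if __name__ == "__main__":
    files = ["a.parquet", "b.parquet"]
    index = [
        FileStats("a.parquet", {"x": (0, 10)}),
        FileStats("b.parquet", {"x": (20, 30)}),
    ]
    print(prune_files(files, index, {"x": 5}, {"x"}))
    print(prune_files(files, index, {"x": 5, "y": 1}, {"x"}))
```

The point of the sketch is only the control flow: pruning is an optimization, so every uncertain or failing path returns the full file list rather than a possibly incorrect subset.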