Dear community,

Happy to share the Hudi community bi-weekly updates for 2021-01-03 ~ 2021-01-17, covering features, bug fixes and tests.
=======================================
Features

[Metadata] Implementation of HUDI RFC-15 [1]
[Metadata] Use metadata table for listing in HoodieROTablePathFilter [2]
[Metadata] Faster initialization of metadata table using parallelized listing [3]
[Metadata] Merge updates of unsynced instants to metadata table [4]
[Metadata] Support for metadata listing for snapshot queries through Hive/SparkSQL [5]
[Metadata] Allow log files generated during restore/rollback to be synced as well [6]
[Metadata] Read clustering plan from requested file for inflight instant [7]
[Client] Introduce WriteClient#preWrite() and relocate metadata table syncing [8]
[Common] Move HoodieEngineContext and its dependencies to hudi-common [9]
[Spark Integration] Support incremental query for MOR table [10]
[Metadata] Make Clustering/ReplaceCommit and metadata table compatible [11]
[Clustering] Support an independent Spark clustering job for asynchronous clustering [12]
[Core] Use HoodieEngineContext to parallelize fetching of partition paths [13]
[Spark Integration] Add a config for Spark SQL overwrite to use INSERT_OVERWRITE_TABLE [14]
[Metadata] MOR rollback and restore support for metadata sync [15]

[1] https://issues.apache.org/jira/browse/HUDI-841
[2] https://issues.apache.org/jira/browse/HUDI-1450
[3] https://issues.apache.org/jira/browse/HUDI-1469
[4] https://issues.apache.org/jira/browse/HUDI-1325
[5] https://issues.apache.org/jira/browse/HUDI-1312
[6] https://issues.apache.org/jira/browse/HUDI-1504
[7] https://issues.apache.org/jira/browse/HUDI-1498
[8] https://issues.apache.org/jira/browse/HUDI-1513
[9] https://issues.apache.org/jira/browse/HUDI-1510
[10] https://issues.apache.org/jira/browse/HUDI-920
[11] https://issues.apache.org/jira/browse/HUDI-1459
[12] https://issues.apache.org/jira/browse/HUDI-1399
[13] https://issues.apache.org/jira/browse/HUDI-1479
[14] https://issues.apache.org/jira/browse/HUDI-1520
[15] https://issues.apache.org/jira/browse/HUDI-1502

=======================================
Bugs

[Core] Fix wrong exception thrown in HoodieAvroUtils [1]
[Hive Integration] Fix sorting of partition values for Hive sync computation [2]
[Metadata] Change timeline utils to support reading replacecommit metadata [3]
[Core] Avoid raw type use for parameter of Transformer interface [4]
[Hive Integration] Revert LinkedHashSet changes that combined fields from oldSchema and newSchema, in favor of using only the new schema for record rewriting [5]

[1] https://issues.apache.org/jira/browse/HUDI-1506
[2] https://issues.apache.org/jira/browse/HUDI-1485
[3] https://issues.apache.org/jira/browse/HUDI-1507
[4] https://issues.apache.org/jira/browse/HUDI-1514
[5] https://issues.apache.org/jira/browse/HUDI-1509

=======================================
Tests

[Tests] Fix HBase index test [1]

[1] https://issues.apache.org/jira/browse/HUDI-1525

Best,
Leesf
