Dear community,

Nice to share the Hudi community bi-weekly updates for 2021-12-05 ~ 2021-12-19, covering features, bug fixes and tests.

=======================================
Features

[Core] Add a hudi-trino-bundle for Trino [1]
[Core] Add a repair util to clean up dangling data and log files [2]

[1] https://issues.apache.org/jira/browse/HUDI-2784
[2] https://issues.apache.org/jira/browse/HUDI-2906

=======================================
Bugs

[Core] Fix corrupt block end position [1]
[Core] For Hive/Presto, remove the temp file created by HoodieMergedLogRecordScanner when the query finishes [2]
[Core] Fixing aws lock configs to inherit from HoodieConfig [3]
[Flink] Shade kryo jar for flink bundle jar [4]
[Core] Fix overflow of huge log file in HoodieLogFormatWriter [5]
[Core] Cache BaseDir if HudiTableNotFound Exception is thrown [6]
[Core] Add TaskCompletionListener for HoodieMergeOnReadRDD to close logScanner when the query finishes [7]
[Core] Fix the bug where clustering jobs cannot run in parallel [8]
[Core] Improve SparkUI job description for write path [9]
[Core] Fixing metadata table for non-partitioned dataset [10]
[Core] Make Z-index a more generic Column-Stats Index [11]
[Core] Make the prefix for metrics name configurable [12]
[Core] Implement #close for AbstractTableFileSystemView [13]
[Build] Upgrade maven plugins to be compatible with higher Java versions [14]
[Core] Metadata table util to get latest file slices for readers/writers [15]
[Core] Sync to HMS when deleting partitions [16]
[Core] Add a check for the existing partitionPath in the catch code block [17]
[Flink] Flink streaming reader 'skip_compaction' option does not work [18]
[Flink] Skip the corrupt meta file for pending rollback action [19]
[Flink] Add explicit write handler for flink [20]
[Core] Implement #reset and #sync for metadata filesystem view [21]
[Core] Clean up the marker directory when the bootstrap operation finishes [22]
[Core] Automatically set spark.sql.parquet.writeLegacyFormat when using bulk insert to write data that contains DecimalType [23]
[Core] InProcess lock provider to guard single writer process with async table operations [24]
[Core] Transaction manager: avoid deadlock when doing begin and end transactions [25]

[1] https://issues.apache.org/jira/browse/HUDI-2900
[2] https://issues.apache.org/jira/browse/HUDI-2876
[3] https://issues.apache.org/jira/browse/HUDI-2964
[4] https://issues.apache.org/jira/browse/HUDI-2957
[5] https://issues.apache.org/jira/browse/HUDI-2665
[6] https://issues.apache.org/jira/browse/HUDI-2779
[7] https://issues.apache.org/jira/browse/HUDI-2966
[8] https://issues.apache.org/jira/browse/HUDI-2901
[9] https://issues.apache.org/jira/browse/HUDI-2849
[10] https://issues.apache.org/jira/browse/HUDI-2952
[11] https://issues.apache.org/jira/browse/HUDI-2814
[12] https://issues.apache.org/jira/browse/HUDI-2974
[13] https://issues.apache.org/jira/browse/HUDI-2984
[14] https://issues.apache.org/jira/browse/HUDI-2946
[15] https://issues.apache.org/jira/browse/HUDI-2938
[16] https://issues.apache.org/jira/browse/HUDI-2990
[17] https://issues.apache.org/jira/browse/HUDI-2994
[18] https://issues.apache.org/jira/browse/HUDI-2996
[19] https://issues.apache.org/jira/browse/HUDI-2997
[20] https://issues.apache.org/jira/browse/HUDI-3024
[21] https://issues.apache.org/jira/browse/HUDI-3015
[22] https://issues.apache.org/jira/browse/HUDI-3001
[23] https://issues.apache.org/jira/browse/HUDI-2958
[24] https://issues.apache.org/jira/browse/HUDI-2962
[25] https://issues.apache.org/jira/browse/HUDI-3029

=======================================
Tests

[Tests] Add data count checks in async clustering tests [1]
[Tests] Multi writer test with conflicting async table services [2]
[Tests] Adding some test fixes to continuous mode multi writer tests [3]
[Tests] De-coupling multi writer tests [4]
[Tests] Fixing a bug in TransactionManager and FileSystemTestLock [5]
[Tests] Fixing default lock configs for FileSystemBasedLock and fixing a flaky test [6]
[Tests] Fix flaky testJsonKafkaSourceResetStrategy [7]
[Tests] Adding tests for archival of replace commit actions [8]

[1] https://issues.apache.org/jira/browse/HUDI-2936
[2] https://issues.apache.org/jira/browse/HUDI-2527
[3] https://issues.apache.org/jira/browse/HUDI-3043
[4] https://issues.apache.org/jira/browse/HUDI-3043
[5] https://issues.apache.org/jira/browse/HUDI-3064
[6] https://issues.apache.org/jira/browse/HUDI-3054
[7] https://issues.apache.org/jira/browse/HUDI-3052
[8] https://issues.apache.org/jira/browse/HUDI-2970

Best,
Leesf