Dear community,

Here is the Hudi community weekly update for 2020-04-12 ~ 2020-04-19, covering features, discussions, bug fixes, and tests, along with the JIRAs marked help wanted [1]. Please feel free to pick one up.

=====================================
Help wanted

[Writer Core] Remove Rolling Stat management from Hudi Writer [2]
[Code Cleanup] Introduce abstraction for writing and reading and compacting from FileGroups [3]
[Code Cleanup] Abstract/Refactor all transaction management logic into a set of classes [4]
[Index] Introduce ability to compress bloom filters while storing in parquet [5]
[DeltaStreamer] Implement support for bootstrapping in HoodieDeltaStreamer [6]

=====================================
Discussion

[Writer Core] A discussion about insert overwrite with snapshot isolation: the plan is to introduce a new API, insertOverride, which overwrites partitions with new records instead of upserting [7]

=====================================
Questions

[Incremental Query] How to understand incremental query? [8]
[Duplicate Writes] Manual deletion of a parquet file [9]
[Concurrent Writes] Hudi concurrent writes [10]

=====================================
Features

[Utilities] Copy default values of fields if not present when rewriting incoming record with new schema [11]
[Writer Core] Organize ingest API implementation under a single package [12]
[Writer Core] Refactoring rollback and restore actions using the ActionExecutor abstraction [13]
[Utilities] Integrate checkpoint provider with delta streamer [14]
[Writer Core] Added checks to validate Hoodie's schema evolution [15]

=====================================
Bugs

[DeltaStreamer] Use appropriate FS when loading configs [16]

=====================================
Tests

[Tests] Add unit test for CleansCommand [17]
[Tests] Migrate test cases to JUnit 5 [18]
[Tests] Migrate Mockito to work with JUnit 5 [19]

[1] https://jira.apache.org/jira/browse/HUDI-760?jql=project%20%3D%20HUDI%20AND%20labels%20%3D%20help-wanted
[2] https://jira.apache.org/jira/browse/HUDI-760
[3] https://jira.apache.org/jira/browse/HUDI-684
[4] https://jira.apache.org/jira/browse/HUDI-677
[5] https://jira.apache.org/jira/browse/HUDI-558
[6] https://jira.apache.org/jira/browse/HUDI-425
[7] https://lists.apache.org/thread.html/rf854b5792b38608c96561d43cc3081d084f26564f7dae53e2e4e6922%40%3Cdev.hudi.apache.org%3E
[8] https://lists.apache.org/thread.html/r49fd47ad2bfae0f4b017447230076778a84fd4aaa4898901431e4bd2%40%3Cdev.hudi.apache.org%3E
[9] https://lists.apache.org/thread.html/r89d50c153578698830fb3bbef55d84fbc5c01c7d704184a2420d3e03%40%3Cdev.hudi.apache.org%3E
[10] https://lists.apache.org/thread.html/re793f2178115348e31430a97cfe78aa0f1e1996fa887798ed148174b%40%3Cdev.hudi.apache.org%3E
[11] https://jira.apache.org/jira/browse/HUDI-727
[12] https://jira.apache.org/jira/browse/HUDI-770
[13] https://jira.apache.org/jira/browse/HUDI-761
[14] https://jira.apache.org/jira/browse/HUDI-759
[15] https://jira.apache.org/jira/browse/HUDI-741
[16] https://jira.apache.org/jira/browse/HUDI-799
[17] https://jira.apache.org/jira/browse/HUDI-698
[18] https://jira.apache.org/jira/browse/HUDI-780
[19] https://jira.apache.org/jira/browse/HUDI-798

Best,
Leesf
