Dear community,

I am happy to share the Hudi community weekly update for 2020-04-19 ~ 2020-04-26, covering features, discussions, bug fixes, and tests. It also lists Jiras marked help wanted [1] and PRs that need review; please feel free to pick them up.

=====================================
Jiras Help wanted (please pick up)

[Writer Core] Remove Rolling Stat management from Hudi Writer [2]
[Code Cleanup] Introduce abstraction for writing and reading and compacting from FileGroups [3]
[Code Cleanup] Abstract/Refactor all transaction management logic into a set of classes [4]
[Index] Introduce ability to compress bloom filters while storing in parquet [5]
[DeltaStreamer] Implement support for bootstrapping in HoodieDeltaStreamer [6]

=====================================
PRs Help wanted (please review)

https://github.com/apache/incubator-hudi/pull/1518
https://github.com/apache/incubator-hudi/pull/1514
https://github.com/apache/incubator-hudi/pull/1524
https://github.com/apache/incubator-hudi/pull/1559

=====================================
Discussion

[Docs] A discussion about moving the blog from cwiki to the website, which reached a consensus [7]
[Reader] A discussion on an abstraction for HoodieInputFormat and RecordReader; Gary also provided an RFC for the discussion [8]
[Release] A discussion about the timeline of the next release; Sudha will be the release manager and will cut a major release (0.6.0) [9]
[Metrics] A discussion about making Hudi support popular metrics reporters, such as Datadog [10]
[Thoughts] A discussion about doing a bug bash for a week to close out some pesky bugs [11]
[Core] A discussion about the generic types of HoodieRecordPayload; we need to think more before migrating to explicit generic types, since it is a user-facing API [12]

=====================================
Questions

[Spark] Table read fails in spark-submit, whereas it succeeds in spark-shell [13]

=====================================
Features

[Spark Integration] Make UserDefinedBulkInsertPartitioner configurable for DataSource [14]
[Hive Integration] Supporting hive combine input format for realtime tables [15]
[Utilities] Adjust logic of upsert in HDFSParquetImporter [16]
[Hudi Cli] Add a command to hudi-cli to export commit metadata [17]
[Code Cleanup] Refactor compaction/savepoint execution based on ActionExecutor abstraction [18]

=====================================
Bugs

[DeltaStreamer] Fixing JCommander param parsing in deltastreamer [19]
[Writer Core] Fixed MAX_MEMORY_FOR_MERGE_PROP and MAX_MEMORY_FOR_COMPACTION_PROP not working due to HUDI-678 [20]
[Writer Core] Handle auto-deleted empty aux folder [21]

=====================================
Tests

[Tests] Migrate CommonTestHarness to JUnit 5 [22]

[1] https://jira.apache.org/jira/browse/HUDI-760?jql=project%20%3D%20HUDI%20AND%20labels%20%3D%20help-wanted
[2] https://jira.apache.org/jira/browse/HUDI-760
[3] https://jira.apache.org/jira/browse/HUDI-684
[4] https://jira.apache.org/jira/browse/HUDI-677
[5] https://jira.apache.org/jira/browse/HUDI-558
[6] https://jira.apache.org/jira/browse/HUDI-425
[7] https://lists.apache.org/thread.html/r6b86907773454f5120a5bb38b308feff6bd641dd1c85718263fdd645%40%3Cdev.hudi.apache.org%3E
[8] https://lists.apache.org/thread.html/r97638198fe744d6427796d8275a36c309b2826da77a2f3bde2adb224%40%3Cdev.hudi.apache.org%3E
[9] https://lists.apache.org/thread.html/raf0b2c8425a1cd95481da63db5f0e63c78e5f5e79aad847411644739%40%3Cdev.hudi.apache.org%3E
[10] https://lists.apache.org/thread.html/re8f10d9f2ab8454598a1bedb0b483f3794fe852e9fb17a6f98ef351c%40%3Cdev.hudi.apache.org%3E
[11] https://lists.apache.org/thread.html/r29654d5fd546c01302d23777b97230821f7f0402d9e8d17018e741a3%40%3Cdev.hudi.apache.org%3E
[12] https://lists.apache.org/thread.html/r5cff5207f91d1f2fcc4c1c9b2440d1a9ea3c05787d807c7eacef22e0%40%3Cdev.hudi.apache.org%3E
[13] https://lists.apache.org/thread.html/r331427aa0dd6894e18ed881db4b7040388e7800cfa67107f24c9e496%40%3Cdev.hudi.apache.org%3E
[14] https://jira.apache.org/jira/browse/HUDI-772
[15] https://jira.apache.org/jira/browse/HUDI-371
[16] https://issues.apache.org/jira/browse/HUDI-789
[17] https://issues.apache.org/jira/browse/HUDI-757
[18] https://issues.apache.org/jira/browse/HUDI-785
[19] https://issues.apache.org/jira/browse/HUDI-821
[20] https://issues.apache.org/jira/browse/HUDI-816
[21] https://issues.apache.org/jira/browse/HUDI-795
[22] https://issues.apache.org/jira/browse/HUDI-809

Best,
Leesf
