Dear community,

Happy to share the Hudi community weekly update for 2020-01-05 ~ 2020-01-12, with updates on development, features, and bug fixes.

Development

[Terminologies simplification] A full document introducing the design and architecture of Hudi has been written [1]; contributions are welcome.
[JDBC Incremental Puller] A discussion about introducing a JDBC Delta Streamer to make Hudi more powerful has been started [2], and an RFC has been drafted for comments [3].
[New Website] The PR provided by lamberKen to introduce the new Hudi website has been merged. You can check it out [4], and feedback is welcome [5].
[Weekly update] There is a discussion thread about giving a weekly update on the Hudi community to expand the visibility of Hudi.
[Configuration refactor] A discussion thread about refactoring the configuration framework of Hudi is about to start [6].
[Release] A discussion about the code freeze date (Jan 15) for the next release (0.5.1) reached a consensus [7].

[1] https://cwiki.apache.org/confluence/display/HUDI/Design+And+Architecture
[2] https://lists.apache.org/thread.html/r31b03a964c234e0903847ba60d9d7b340d0b59daa5232ae922a5b38d%40%3Cdev.hudi.apache.org%3E
[3] https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller
[4] https://hudi.apache.org/newsite-content/
[5] https://github.com/apache/incubator-hudi/issues/1196
[6] https://lists.apache.org/thread.html/1fd96c9ff258aa35c030d07b929fdc15c2ebe93b155e1067ff45259c%40%3Cdev.hudi.apache.org%3E
[7] https://lists.apache.org/thread.html/r14291a41be93ff178f22faa292d5e2a09fc7c294b7d89216c132083a%40%3Cdev.hudi.apache.org%3E

Features

[DeltaStreamer] Add Delete() support to DeltaStreamer [8]
[Client] Refactor HoodieWriteClient so that commit logic can be shared by both bootstrap and normal write operations [9]
[Docs] Add a new Maven profile to generate unified Javadoc for all Java and Scala classes [10]
[Hive Integration] Optimize HoodieInputformat.listStatus() for faster Hive incremental queries on Hoodie [11]
[Writer] Add an option to overwrite the payload implementation in the hoodie.properties file [12]
[DeltaStreamer] Introduce a default partition path in TimestampBasedKeyGenerator [13]
[Spark Integration] Replace Databricks spark-avro with native spark-avro [14]
[Writer] Upgrade Hudi to Spark 2.4 [15]
[Utilities] Provide a custom time zone definition for TimestampBasedKeyGenerator [16]

[8] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-377
[9] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-417
[10] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-319
[11] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-25
[12] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-114
[13] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-406
[14] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-91
[15] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-12
[16] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-502

Bugs

[Incremental Pull] Fix NPE when reading IncrementalPull.sqltemplate in HiveIncrementalPuller [17]
[CLI] HoodieCommitMetadata only shows the insert rows of the first commit [18]
[CLI] CLI doesn't allow rolling back a delta commit [19]
[DeltaStreamer] DeltaStreamer should pick checkpoints off only deltacommits for MOR tables [20]

[17] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-484
[18] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-469
[19] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-248
[20] https://issues.apache.org/jira/projects/HUDI/issues/HUDI-322
