bvaradar commented on a change in pull request #1277: [WIP][HUDI-543] release notes for 0.5.1
URL: https://github.com/apache/incubator-hudi/pull/1277#discussion_r370947587
##########
File path: docs/_pages/releases.md
##########
@@ -6,6 +6,47 @@ toc: true
 last_modified_at: 2019-12-30T15:59:57-04:00
 ---
+## [Release 0.5.1-incubating]
+
+### Download Information
+ * Source Release : [Apache Hudi(incubating) 0.5.1-incubating Source Release](https://www.apache.org/dist/incubator/hudi/0.5.1-incubating/hudi-0.5.1-incubating.src.tgz) ([asc](https://www.apache.org/dist/incubator/hudi/0.5.1-incubating/hudi-0.5.1-incubating.src.tgz.asc), [sha512](https://www.apache.org/dist/incubator/hudi/0.5.1-incubating/hudi-0.5.1-incubating.src.tgz.sha512))
+ * Apache Hudi (incubating) jars corresponding to this release is available [here](https://repository.apache.org/#nexus-search;quick~hudi)
+
+### Release Highlights
+* Dependency Version Upgrades
+  * Upgrade from Spark 2.1.0 to Spark 2.4.4
+  * Upgrade from Avro 1.7.7 to Avro 1.8.2
+  * Upgrade from Parquet 1.8.1 to Parquet 1.10.1
+* **IMPORTANT** This version requires your runtime spark version to be upgraded to 2.4+.
+* Hudi now supports both Scala 2.11 and Scala 2.12, please refer to [Build with Scala 2.12](https://github.com/apache/incubator-hudi#build-with-scala-212) to build with Scala 2.12.
+Also, the packages hudi-spark, hudi-utilities, hudi-spark-bundle and hudi-utilities-bundle are changed correspondingly to hudi-spark_{scala_version}, hudi-spark_{scala_version}, hudi-utilities_{scala_version}, hudi-spark-bundle_{scala_version} and hudi-utilities-bundle_{scala_version}.
+Note that scala_version here is one of (2.11, 2.12).
+* With 0.5.1, we added functionality to stop using renames for Hudi timeline metadata operations. This feature is automatically enabled for newly created Hudi tables. For existing tables, this feature is turned off by default. Please read this [section](deployment_link), before enabling this feature for existing hudi table.
+To enable the new hudi timeline layout which avoids renames, use the write config "hoodie.timeline.layout.version=1".
Alternatively, you can append the line "hoodie.timeline.layout.version=1" to hoodie.properties. Note that in any case, upgrade hudi readers (query engines) first with 0.5.1-incubating release before upgrading writer.
+* CLI supports `repair overwrite-hoodie-props` to overwrite the table's hoodie.properties with specified file.
+* DeltaStreamer CLI parameter for capturing table type is changed from --storage-type to --table-type. Refer to [wiki](https://cwiki.apache.org/confluence/display/HUDI/Design+And+Architecture) with more latest terminologies.
+* Configuration Value change for Kafka Reset Offset Strategies. Enum values are changed from LARGEST to LATEST, SMALLEST to EARLIEST for configuring Kafka reset offset strategies with configuration(auto.offset.reset) in deltastreamer.
+* When using spark-shell to give a quick peek at Hudi, please provide --packages org.apache.spark:spark-avro_2.11:2.4.4, more details would refer to [latest quickstart docs](https://hudi.apache.org/docs/quick-start-guide.html)
+* Key generator moved to separate package under org.apache.hudi.keygen. If you are using overridden key generator classes (configuration ("hoodie.datasource.write.keygenerator.class")) that comes with hudi package, please make change the fully qualified class name is changed accordingly.
+* Hive Sync tool will register RO tables for MOR with a _ro suffix, so query with _ro suffix. You would use `--skip-ro-suffix` in sync config to control suffix.
+* With 0.5.1, hudi-hadoop-mr-bundle which is used by query engines such as presto and hive includes shaded avro package to support hudi real time queries through these engines. Hudi supports pluggable logic for merging of records. Users provide their own implementation of [HoodieRecordPayload](https://github.com/apache/incubator-hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecordPayload.java).
+If you are using this feature, you need to relocate the avro dependencies in your custom record payload class to be consistent with internal hudi shading. You need to add the following relocation when shading the package containing the record payload implementation.
+
+ ```xml
+<relocation>
+  <pattern>org.apache.avro.</pattern>
+  <shadedPattern>org.apache.hudi.org.apache.avro.</shadedPattern>
+</relocation>
+ ```
+
+ * Better delete support in DeltaStreamer would refer to [latest quickstart docs](https://hudi.apache.org/docs/quick-start-guide.html)
+ * Support for AWS Database Migration Service(DMS) in DeltaStreamer
+ * Support for DynamicBloomFilter.
+ * Support option to overwrite payload implementation in hoodie.properties file.

Review comment:
This is already covered in the point above.
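
For anyone updating build files against these notes, the package renaming in the Scala bullet can be made concrete. This is a minimal sketch, assuming a hypothetical helper class `HudiArtifactNames` that is not part of Hudi. It only applies the `{name}_{scala_version}` rule from the notes to the four packages listed there, and it rejects Scala versions other than 2.11 and 2.12.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Hudi): applies the Scala-suffix naming rule from the 0.5.1 notes. */
final class HudiArtifactNames {
    // The four packages the notes say now carry a Scala suffix.
    static final List<String> SUFFIXED_MODULES = Arrays.asList(
            "hudi-spark", "hudi-utilities", "hudi-spark-bundle", "hudi-utilities-bundle");

    // The Scala versions the notes list as supported.
    static final List<String> SCALA_VERSIONS = Arrays.asList("2.11", "2.12");

    private HudiArtifactNames() {
    }

    /** Returns the renamed package, e.g. hudi-spark-bundle with 2.11 gives hudi-spark-bundle_2.11. */
    static String withScalaSuffix(String module, String scalaVersion) {
        if (!SUFFIXED_MODULES.contains(module)) {
            throw new IllegalArgumentException("Not listed as renamed: " + module);
        }
        if (!SCALA_VERSIONS.contains(scalaVersion)) {
            throw new IllegalArgumentException("Unsupported Scala version: " + scalaVersion);
        }
        return module + "_" + scalaVersion;
    }
}
```

Calling `HudiArtifactNames.withScalaSuffix("hudi-utilities-bundle", "2.12")` yields `hudi-utilities-bundle_2.12`. Each of the four packages maps to exactly one suffixed name per Scala version.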
