This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 667b1fb55e [DOCS][MINOR] Fixing migration guide for 0.12.0 release page (#7330)
667b1fb55e is described below
commit 667b1fb55e55b86b04b50035e7dedf4b3e8df6d0
Author: Sivabalan Narayanan <[email protected]>
AuthorDate: Tue Nov 29 12:00:00 2022 -0800
[DOCS][MINOR] Fixing migration guide for 0.12.0 release page (#7330)
---
website/releases/release-0.12.0.md | 110 ++++++++++++++++++-------------------
1 file changed, 55 insertions(+), 55 deletions(-)
diff --git a/website/releases/release-0.12.0.md b/website/releases/release-0.12.0.md
index c76781d037..07daa59b2a 100644
--- a/website/releases/release-0.12.0.md
+++ b/website/releases/release-0.12.0.md
@@ -7,6 +7,61 @@ last_modified_at: 2022-08-17T10:30:00+05:30
---
# [Release 0.12.0](https://github.com/apache/hudi/releases/tag/release-0.12.0) ([docs](/docs/quick-start-guide))
+## Migration Guide
+
+In this release, there have been a few API and configuration updates listed below that warranted a new table version.
+Hence, the latest [table version](https://github.com/apache/hudi/blob/bf86efef719b7760ea379bfa08c537431eeee09a/hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableVersion.java#L41)
+is `5`. For existing Hudi tables on older versions, a one-time upgrade step will be executed automatically. Please take
+note of the following updates before upgrading to Hudi 0.12.0.
+
+### Configuration Updates
+
+In this release, the default values of a few configurations have been changed. They are as follows:
+
+- `hoodie.bulkinsert.sort.mode`: This config determines the mode for sorting records during bulk insert. Its default
+value has been changed from `GLOBAL_SORT` to `NONE`, which means no sorting is done and the write matches
+`spark.write.parquet()` in terms of overhead.
+- `hoodie.datasource.hive_sync.partition_value_extractor`: This config is used to extract and transform partition
+values during Hive sync. Its default value has been changed from `SlashEncodedDayPartitionValueExtractor` to
+`MultiPartKeysValueExtractor`. If you relied on the previous default value (i.e., have not set it explicitly), you are
+required to set the config to `org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor`. From this release, if
+this config is not set and Hive sync [...]
+- The following configs will be inferred, if not set manually, from other configs' values:
+ - `META_SYNC_BASE_FILE_FORMAT`: inferred from `org.apache.hudi.common.table.HoodieTableConfig.BASE_FILE_FORMAT`
+
+ - `META_SYNC_ASSUME_DATE_PARTITION`: inferred from `org.apache.hudi.common.config.HoodieMetadataConfig.ASSUME_DATE_PARTITIONING`
+
+ - `META_SYNC_DECODE_PARTITION`: inferred from `org.apache.hudi.common.table.HoodieTableConfig.URL_ENCODE_PARTITIONING`
+
+ - `META_SYNC_USE_FILE_LISTING_FROM_METADATA`: inferred from `org.apache.hudi.common.config.HoodieMetadataConfig.ENABLE`
+
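As an illustrative sketch only (not part of the release notes), the option keys above can be set explicitly to retain the pre-0.12.0 defaults described in this section; the helper name `pre0120Defaults` is hypothetical, and the resulting map would typically be passed as writer options (e.g., via the Spark datasource `.options(...)`):

```java
import java.util.HashMap;
import java.util.Map;

public class LegacyDefaults {

    // Hypothetical helper collecting the pre-0.12.0 default values
    // listed above, so existing pipelines keep their old behavior.
    static Map<String, String> pre0120Defaults() {
        Map<String, String> opts = new HashMap<>();
        // Restore global sorting for bulk insert (the 0.12.0 default is NONE).
        opts.put("hoodie.bulkinsert.sort.mode", "GLOBAL_SORT");
        // Restore the pre-0.12.0 Hive sync partition value extractor.
        opts.put("hoodie.datasource.hive_sync.partition_value_extractor",
                "org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor");
        return opts;
    }

    public static void main(String[] args) {
        pre0120Defaults().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```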
+### API Updates
+
+In `SparkKeyGeneratorInterface`, the return type of the `getRecordKey` API has been changed from `String` to
+`UTF8String`. Implementations that previously returned a `String` can typically wrap it with `UTF8String.fromString(...)`.
+```java
+// Before
+String getRecordKey(InternalRow row, StructType schema);
+
+// After
+UTF8String getRecordKey(InternalRow row, StructType schema);
+```
+
+### Fallback Partition
+
+If the partition field value is null, Hudi has a fallback mechanism instead of failing the write. Until 0.9.0,
+`__HIVE_DEFAULT_PARTITION__` was used as the fallback partition. After 0.9.0, due to some refactoring, the fallback
+partition changed to `default`. This default partition does not sit well with some of the query engines. So, we are
+switching the fallback partition back to `__HIVE_DEFAULT_PARTITION__` from 0.12.0. We have added an upgrade step wherein
+we fail the upgrade if the existing Hudi table has a partition named `default`. Users are expected to rewrite the data
+in this partition to a partition named [\_\_HIVE_DEFAULT_PARTITION\_\_](https://github.com/apache/hudi/blob/0d0a4152cfd362185066519ae926ac4513c7a152/hudi-common/src/main/java/org/apache/hudi/common/util/PartitionPathEncodeUtils.java#L29).
+However, if you had intentionally named your partition `default`, you can bypass this check using the config
+`hoodie.skip.default.partition.validation`.
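For illustration only, the fallback behavior described above can be sketched as follows; `resolvePartitionPath` is a hypothetical stand-in for Hudi's internal logic, not an actual Hudi API:

```java
public class FallbackPartition {

    // Fallback partition name used by Hudi from 0.12.0 onwards
    // (and before 0.9.0), as described above.
    static final String FALLBACK_PARTITION = "__HIVE_DEFAULT_PARTITION__";

    // Illustrative stand-in: a null or empty partition field value maps to
    // the fallback partition instead of failing the write.
    static String resolvePartitionPath(String partitionFieldValue) {
        if (partitionFieldValue == null || partitionFieldValue.isEmpty()) {
            return FALLBACK_PARTITION;
        }
        return partitionFieldValue;
    }

    public static void main(String[] args) {
        System.out.println(resolvePartitionPath(null));         // falls back
        System.out.println(resolvePartitionPath("2022-11-29")); // kept as-is
    }
}
```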
+
+### Bundle Updates
+
+- `hudi-aws-bundle` extracts the AWS-related dependencies out of `hudi-utilities-bundle` and `hudi-spark-bundle`. In
+order to use features such as Glue sync, the CloudWatch metrics reporter, or the DynamoDB lock provider, users need to
+provide the `hudi-aws-bundle` jar along with the `hudi-utilities-bundle` or `hudi-spark-bundle` jars.
+- Spark 3.3 support is added; users who are on Spark 3.3 can use `hudi-spark3.3-bundle` or `hudi-spark3-bundle` (legacy bundle name).
+- Spark 3.2 will continue to be supported via `hudi-spark3.2-bundle`.
+- Spark 3.1 will continue to be supported via `hudi-spark3.1-bundle`.
+- Spark 2.4 will continue to be supported via `hudi-spark2.4-bundle` or `hudi-spark-bundle` (legacy bundle name).
+- Flink 1.15 support is added; users who are on Flink 1.15 can use `hudi-flink1.15-bundle`.
+- Flink 1.14 will continue to be supported via `hudi-flink1.14-bundle`.
+- Flink 1.13 will continue to be supported via `hudi-flink1.13-bundle`.
+
## Release Highlights
### Presto-Hudi Connector
@@ -105,61 +160,6 @@ This version brings more improvements to make Hudi the most performant lake stor
We recently benchmarked Hudi against TPC-DS workload.
Please check out [our blog](/blog/2022/06/29/Apache-Hudi-vs-Delta-Lake-transparent-tpc-ds-lakehouse-performance-benchmarks) for more details.
-### Migration Guide
-
-In this release, there have been a few API and configuration updates listed
below that warranted a new table version.
-Hence, the latest [table
version](https://github.com/apache/hudi/blob/bf86efef719b7760ea379bfa08c537431eeee09a/hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableVersion.java#L41)
-is `5`. For existing Hudi tables on older version, a one-time upgrade step
will be executed automatically. Please take
-note of the following updates before upgrading to Hudi 0.12.0.
-
-#### Configuration Updates
-
-In this release, the default value for a few configurations have been changed.
They are as follows:
-
-- `hoodie.bulkinsert.sort.mode`: This config is used to determine mode for
sorting records for bulk insert. Its default value has been changed from
`GLOBAL_SORT` to `NONE`, which means no sorting is done and it matches
`spark.write.parquet()` in terms of overhead.
-- `hoodie.datasource.hive_sync.partition_value_extractor`: This config is used
to extract and transform partition value during Hive sync. Its default value
has been changed from `SlashEncodedDayPartitionValueExtractor` to
`MultiPartKeysValueExtractor`. If you relied on the previous default value
(i.e., have not set it explicitly), you are required to set the config to
`org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor`. From this
release, if this config is not set and Hive sync [...]
-- The following configs will be inferred, if not set manually, from other
configs' values:
- - `META_SYNC_BASE_FILE_FORMAT`: infer from
`org.apache.hudi.common.table.HoodieTableConfig.BASE_FILE_FORMAT`
-
- - `META_SYNC_ASSUME_DATE_PARTITION`: infer from
`org.apache.hudi.common.config.HoodieMetadataConfig.ASSUME_DATE_PARTITIONING`
-
- - `META_SYNC_DECODE_PARTITION`: infer from
`org.apache.hudi.common.table.HoodieTableConfig.URL_ENCODE_PARTITIONING`
-
- - `META_SYNC_USE_FILE_LISTING_FROM_METADATA`: infer from
`org.apache.hudi.common.config.HoodieMetadataConfig.ENABLE`
-
-#### API Updates
-
-In `SparkKeyGeneratorInterface`, return type of the `getRecordKey` API has
been changed from String to UTF8String.
-```java
-// Before
-String getRecordKey(InternalRow row, StructType schema);
-
-
-// After
-UTF8String getRecordKey(InternalRow row, StructType schema);
-```
-
-#### Fallback Partition
-
-If partition field value was null, Hudi has a fallback mechanism instead of
failing the write. Until 0.9.0,
-`__HIVE_DEFAULT_PARTITION__` was used as the fallback partition. After 0.9.0,
due to some refactoring, fallback
-partition changed to `default`. This default partition does not sit well with
some of the query engines. So, we are
-switching the fallback partition to `__HIVE_DEFAULT_PARTITION__` from 0.12.0.
We have added an upgrade step where in,
-we fail the upgrade if the existing Hudi table has a partition named
`default`. Users are expected to rewrite the data
-in this partition to a partition named
[\_\_HIVE_DEFAULT_PARTITION\_\_](https://github.com/apache/hudi/blob/0d0a4152cfd362185066519ae926ac4513c7a152/hudi-common/src/main/java/org/apache/hudi/common/util/PartitionPathEncodeUtils.java#L29).
-However, if you had intentionally named your partition as `default`, you can
bypass this using the config `hoodie.skip.default.partition.validation`.
-
-#### Bundle Updates
-
-- `hudi-aws-bundle` extracts away aws-related dependencies from
hudi-utilities-bundle or hudi-spark-bundle. In order to use features such as
Glue sync, Cloudwatch metrics reporter or DynamoDB lock provider, users need to
provide hudi-aws-bundle jar along with hudi-utilities-bundle or
hudi-spark-bundle jars.
-- Spark 3.3 support is added; users who are on Spark 3.3 can use
`hudi-spark3.3-bundle` or `hudi-spark3-bundle` (legacy bundle name).
-- Spark 3.2 will continue to be supported via `hudi-spark3.2-bundle`.
-- Spark 3.1 will continue to be supported via `hudi-spark3.1-bundle`.
-- Spark 2.4 will continue to be supported via `hudi-spark2.4-bundle` or
`hudi-spark-bundle` (legacy bundle name).
-- Flink 1.15 support is added; users who are on Flink 1.15 can use
`hudi-flink1.15-bundle`.
-- Flink 1.14 will continue to be supported via `hudi-flink1.14-bundle`.
-- Flink 1.13 will continue to be supported via `hudi-flink1.13-bundle`.
-
## Known Regressions:
We discovered a regression in Hudi 0.12 release related to Bloom