This is an automated email from the ASF dual-hosted git repository.

lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon-website.git
The following commit(s) were added to refs/heads/master by this push:
     new 5f22f9b3f  Add release note for 0.9
5f22f9b3f is described below

commit 5f22f9b3ff0a29c3cd2d66fac16e7d6857fad5a8
Author: Jingsong <jingsongl...@gmail.com>
AuthorDate: Fri Sep 13 19:17:27 2024 +0800

    Add release note for 0.9
---
 main/template/nav.html                |   2 +-
 pages/content/releases/release-0.9.md | 331 ++++++++++++++++++++++++++++++++++
 2 files changed, 332 insertions(+), 1 deletion(-)

diff --git a/main/template/nav.html b/main/template/nav.html
index fdade3090..08948081a 100644
--- a/main/template/nav.html
+++ b/main/template/nav.html
@@ -19,7 +19,7 @@
         <a class="nav-link" href="https://paimon.apache.org/downloads.html">Downloads</a>
       </li>
       <li class="nav-item active px-3">
-        <a class="nav-link" href="https://paimon.apache.org/release-0.8.2.html">Releases</a>
+        <a class="nav-link" href="https://paimon.apache.org/release-0.9.html">Releases</a>
       </li>
       <li class="nav-item active px-3">
         <a class="nav-link" href="https://github.com/apache/paimon/">Github</a>

diff --git a/pages/content/releases/release-0.9.md b/pages/content/releases/release-0.9.md
new file mode 100644
index 000000000..728a48bcb
--- /dev/null
+++ b/pages/content/releases/release-0.9.md
@@ -0,0 +1,331 @@

---
title: "Release 0.9"
weight: 992
type: docs
aliases:
- /release-0.9.html
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Apache Paimon 0.9 Available

Sep 13, 2024 - Jingsong Lee (jingsongl...@gmail.com)

The Apache Paimon PMC officially announces the release of Apache Paimon 0.9.0. This version was developed over the
course of four months, with the participation of 80 contributors and more than 600 commits.

Thank you to all contributors for your support!

The community has decided that the next version will be 1.0, marking that most of Apache Paimon's functionality has
become relatively mature and stable.

## Version Overview

Paimon's long-term goal is to become a unified lake storage format that meets the main requirements of minute-level
big data processing: offline batch computing, real-time stream computing, and OLAP.

Notable changes in this version include:

1. Paimon Branch: The branch functionality is now officially production-ready, and the new 'scan.fallback-branch'
   feature helps businesses better unify streaming and batch storage.
2. Universal Format: This version introduces native Iceberg compatibility. With the compatibility mode enabled,
   Paimon generates Iceberg-compatible snapshots in real time, so the Iceberg ecosystem can read Paimon tables directly.
3. Caching Catalog: This version enables a caching catalog by default. Table metadata and manifest files are cached
   in the catalog, which can accelerate OLAP query performance.
4. Improved Availability of Bucketed Append Tables: The small-file problem has been significantly alleviated, and
   these tables can be used for bucketed joins in Spark, reducing shuffles during joins.
5. Support for DELETE, UPDATE, and MERGE INTO in Append Tables: You can modify and delete records in append tables
   using Spark SQL, with support for the Deletion Vectors mode as well.

## Compatibility Changes

The following changes may affect compatibility with your existing usage.

### Bucketed Append Tables

A table defined without a primary key but with a 'bucket' number is considered a bucketed append table, previously
called an append queue table because it is most commonly used for ordered stream writes and reads. The small-file
problem has been significantly alleviated, and these tables can be used for bucketed joins in Spark, reducing shuffles
during joins.

Here are the changes to its default behavior:

1. Bucketed append tables can no longer be defined without a bucket-key; a valid definition is sketched below.
   Previous versions hashed the entire row to determine the bucket, which was unintuitive. If you rely on such a
   table, we recommend using an older version of Paimon to rewrite the data into a new, valid table.
2. The default value of the 'compaction.max.file-num' option for bucketed append tables has been lowered to 5, so
   fewer small files accumulate within a single bucket and excessive small files no longer hurt production usability.

Despite these improvements, we still recommend avoiding bucketed append tables unless necessary; the default
bucket = -1 mode is more user-friendly.
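For illustration, a bucketed append table in 0.9 must spell out its bucket key explicitly; a minimal Flink SQL sketch
(the table and column names are hypothetical):

```sql
-- A bucketed append table: no primary key, explicit bucket count and bucket-key
CREATE TABLE bucketed_logs (
    dt STRING,
    user_id BIGINT,
    behavior STRING
) WITH (
    'bucket' = '4',
    'bucket-key' = 'user_id'  -- omitting this is rejected in 0.9
);
```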
### File Format and Compression

The Paimon community is focused on improving out-of-the-box performance under the default options. The following
default values have changed in version 0.9:

1. File format 'file.format': The default has changed from ORC to Parquet. There is no essential difference between
   the two formats, but Parquet generally performs better, and the community has completed full Parquet support,
   including nested types, filter pushdown, and more.
2. File size 'target-file-size': The default for primary key tables remains 128 MB, while the default for
   non-primary-key (append) tables has been raised to 256 MB.
3. Compression 'file.compression': The default has changed from LZ4 to ZSTD, with the default ZSTD compression level
   set to 1. You can raise the level via 'file.compression.zstd-level', trading more CPU for a higher compression ratio.
4. Local spill compression 'spill-compression.zstd-level': Likewise, local spill can achieve a higher compression
   ratio by raising this level.
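To show how these options fit together, here is a hedged sketch that pins the previous defaults back onto a single
table (the table and columns are hypothetical; the option keys are those listed above):

```sql
-- Illustrative only: override the new 0.9 defaults on one table
CREATE TABLE t (
    id BIGINT,
    v  STRING
) WITH (
    'file.format' = 'orc',               -- the global default is now parquet
    'target-file-size' = '128 mb',       -- append tables now default to 256 mb
    'file.compression' = 'zstd',         -- the new default, replacing lz4
    'file.compression.zstd-level' = '3'  -- higher level: more CPU, better ratio
);
```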
### CDC Ingestion

The Flink CDC dependency has been upgraded to version 3.1. Because Flink CDC became a sub-project of Flink in that
version, its package names changed, so older CDC versions are no longer supported. MySQL CDC, MongoDB CDC, and
Postgres CDC are affected.

## Paimon Branch

Branching is an interesting feature that lets you manipulate Paimon tables in a manner similar to Git. It has reached
a production-ready state in Paimon 0.9, and Alibaba's internal teams are already using it in production for tasks
such as data correction and stream-batch unification.

For example, you can use branches for data correction:

```sql
-- create branch named 'branch1' from tag 'tag1'
CALL sys.create_branch('default.T', 'branch1', 'tag1');

-- write to branch 'branch1'
INSERT INTO `T$branch_branch1` SELECT ...

-- read from branch 'branch1'
SELECT * FROM `T$branch_branch1`;

-- replace the master branch with 'branch1'
CALL sys.fast_forward('default.T', 'branch1');
```

You can also use branches for unified stream-batch storage. Set up a separate streaming branch and then configure
'scan.fallback-branch': when a batch job reads from the current branch and a partition is missing, it will attempt
to read that partition from the fallback branch.

Suppose you create a Paimon table partitioned by date. You have a long-running streaming job inserting records into
Paimon so that today's data can be queried in a timely manner, and a nightly batch job overwriting partitions to
ensure data accuracy. When querying this table, you want to read the batch job's results first; but if a specific
partition (for example, today's) is missing there, you want to fall back to the streaming job's results. In this
case, create a branch for the streaming job and set 'scan.fallback-branch' to that branch.

```sql
-- create a branch for the streaming job (realtime)
CALL sys.create_branch('default.T', 'rt');

-- set primary key and bucket number for the branch
ALTER TABLE `T$branch_rt` SET (
    'primary-key' = 'dt,name',
    'bucket' = '2',
    'changelog-producer' = 'lookup'
);

-- set the fallback branch
ALTER TABLE T SET (
    'scan.fallback-branch' = 'rt'
);

SELECT * FROM T;
```

## Universal Format

Paimon's Universal Format allows Iceberg clients and compute engines to read data stored in Paimon. With the
'metadata.iceberg-compatible' option enabled, Paimon automatically generates Iceberg snapshots in the filesystem
whenever it creates its own snapshots, without requiring any additional dependencies or raising governance concerns.

Notable points:

1. Iceberg metadata is stored in the filesystem (corresponding to Iceberg's HadoopCatalog). For example, you can read
   it with a Spark DataFrame: `spark.read.format("iceberg").load("path")`.
2. The Iceberg view of the table is read-only; writing through it may corrupt the table.
3. For primary key tables, the Iceberg view can only access files at the highest LSM level. You can configure
   'compaction.optimization-interval' to control the visibility of the data.
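As a hedged sketch (assuming these keys can be set like ordinary table options), enabling the compatibility mode on
an existing table might look like this:

```sql
-- Illustrative: generate Iceberg-compatible snapshots for table T
ALTER TABLE T SET (
    'metadata.iceberg-compatible' = 'true'
);

-- For a primary key table, optionally control how often fully compacted
-- data becomes visible to Iceberg readers
ALTER TABLE T SET (
    'compaction.optimization-interval' = '1 h'
);
```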
## Caching Catalog

Paimon's metadata is stored in the filesystem, which leads to frequent filesystem access during the planning phase
of compute engines, potentially creating a single-point performance bottleneck; on object storage this cost is even
higher.

This version enables a caching catalog by default (it only caches manifest files smaller than 1 MB by default), which
can accelerate the performance of OLAP queries.

You can control the behavior of the cache using the following options:

| Option                              | Default | Description                                                                      |
|-------------------------------------|---------|----------------------------------------------------------------------------------|
| cache-enabled                       | true    | Controls whether the catalog caches databases, tables, and manifests.             |
| cache.expiration-interval           | 1 min   | Controls how long databases and tables are cached in the catalog.                 |
| cache.manifest.max-memory           | (none)  | Controls the maximum memory used to cache manifest content.                       |
| cache.manifest.small-file-memory    | 128 mb  | Controls the memory used to cache small manifest files.                           |
| cache.manifest.small-file-threshold | 1 mb    | Controls the size threshold for small manifest files.                             |

## Deletion Vectors

The Deletion Vectors mode is fully available in version 0.9.

For primary key tables, the Deletion Vectors mode now supports asynchronous compaction (semi-synchronous by default),
significantly improving usability without heavily impacting checkpoints. Since the DV mode requires local disk, we
recommend SSDs for local disks; performance can be quite poor on lower-quality HDDs.

For non-primary-key (append) tables, version 0.9 supports DELETE, UPDATE, and MERGE INTO via Spark SQL, making Paimon
append tables resemble full database tables and enabling fine-grained modifications and deletions for users.

Moreover, append tables also support the Deletion Vectors mode. Without it, deletions and modifications use
copy-on-write; once the DV mode is enabled, they switch to merge-on-write. Deletion files are removed during
compaction.

## Core

### New Aggregation Functions

New aggregation functions include hll_sketch, theta_sketch, rbm32, and rbm64. You can use these sketch functions to
estimate COUNT DISTINCT.

Paimon does not support custom aggregation functions, but you are encouraged to propose additions to the built-in
function library in the community.

### Universal File Index

The universal file index now supports the bitmap type, and it supports the REWRITE CALL command for regenerating an
index.

Bitmap indexes perform well under combined filter conditions across multiple fields.

### Historical Partition Compact

If your table is partitioned, its historical partitions may never have undergone full compaction, even though Paimon
has built-in automatic compaction. Version 0.9 introduces partition_idle_time to automatically select partitions that
have not been updated recently for full compaction, reducing small files and improving query performance.

## Flink

### Clustering

Clustering lets you organize data in an append table based on the values of certain columns during the write process.
This data layout can significantly improve the efficiency of downstream reads, enabling faster and more targeted
queries. The feature only supports append tables (with bucket = -1) and batch execution mode.

```sql
INSERT INTO my_table /*+ OPTIONS('sink.clustering.by-columns' = 'a,b') */ SELECT * FROM source;
```
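Since clustering only takes effect in batch execution mode, a Flink SQL session would typically switch the runtime
mode before writing; a minimal sketch reusing the statement above (assuming the SQL client's SET syntax):

```sql
-- Clustering requires batch execution mode
SET 'execution.runtime-mode' = 'batch';

INSERT INTO my_table /*+ OPTIONS('sink.clustering.by-columns' = 'a,b') */
SELECT * FROM source;
```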
### Partition Mark Done

For partitioned tables, each partition may need to trigger downstream batch computations when it is ready. It is
therefore essential to choose the right moment to indicate that a partition is ready to be scheduled, while
minimizing data drift during scheduling. We call this process "marking a partition as done."

```sql
CREATE TABLE my_partitioned_table (
    f0 INT,
    f1 INT,
    f2 INT,
    ...
    dt STRING
) PARTITIONED BY (dt) WITH (
    'partition.timestamp-formatter' = 'yyyyMMdd',
    'partition.timestamp-pattern' = '$dt',
    'partition.time-interval' = '1 d',
    'partition.idle-time-to-done' = '15 m'
);
```

1. First, define how the partition's time is parsed and the time interval between partitions, so Paimon can determine
   when it is appropriate to mark a partition as done.
2. Second, define the idle time, which determines how long a partition must receive no new data before it can be
   marked as done.
3. Third, by default, marking a partition as done creates a _SUCCESS file. You can also configure
   'partition.mark-done-action' to define other actions.

### Table Clone

Paimon 0.9 supports the clone table action for data migration. Currently it only clones the table files used by the
latest snapshot. If the table being cloned is not modified during the process, a Flink batch job is recommended for
better performance; if you want to clone the table while it is being written to, submit a Flink streaming job for
automatic failure recovery. This command enables convenient data backup and migration.

### Procedures

Paimon 0.9 introduces a large number of procedures. In addition, Flink procedures now support named arguments
(available in recent Flink versions), making invocation more convenient since you no longer need to specify every
parameter positionally.

## Spark

### Dynamic Options

Historically, Spark SQL lacked dynamic parameters, while Flink SQL's dynamic options are very convenient. Version 0.9
brings this capability to Spark through the SET command: Paimon configurations are specified by adding the
spark.paimon. prefix.

```sql
-- set a paimon conf
SET spark.paimon.file.block-size=512M;

-- reset the conf
RESET spark.paimon.file.block-size;
```
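The same prefix mechanism should apply to other Paimon options as well, not just write configurations; as a hedged
example, scoping the core scan option 'scan.snapshot-id' for a time-travel style read (the snapshot id is
hypothetical):

```sql
-- Illustrative: read snapshot 5 of a Paimon table via a dynamic option
SET spark.paimon.scan.snapshot-id=5;
SELECT * FROM t;

-- Return to reading the latest snapshot
RESET spark.paimon.scan.snapshot-id;
```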
### Bucketed Join

Hive has a feature called bucketed join: when two tables are bucketed by the same column, the join can be executed
directly on the buckets without shuffling the data, which is very efficient.

Buckets are one of the core concepts in Paimon, but Paimon previously lacked the compute-engine integration needed to
exploit them for this optimization.

In version 0.9, Paimon achieves deep integration with Spark SQL, enabling this optimization:

- Optimized execution: By leveraging the bucket structure during join operations, Paimon can significantly reduce
  the overhead of data shuffling, leading to better performance on join queries.
- Seamless configuration: Users can simply define buckets when creating tables, making bucketed joins straightforward
  to use in Spark SQL queries.

```sql
-- Enable bucketing optimization
SET spark.sql.sources.v2.bucketing.enabled=true;

-- Bucketed Join
SELECT * FROM t1 JOIN t2 ON t1.id = t2.id;
```

As long as both tables in the join are bucketed tables (whether primary key tables or not) and the bucket-key field
matches the join field, an efficient bucketed join takes place. This optimization minimizes data shuffling and
enhances query performance, making it a valuable feature for handling large datasets in Paimon.

### Writing to Dynamic Bucket Tables

- Optimized writing: Spark SQL can now write to dynamic bucket tables with optimizations that reduce data shuffling
  during the first write operation.
- Cross-partition updates: Spark SQL now supports writing to tables with cross-partition updates, although the
  overall efficiency of these operations is still not optimal.

## Other Progress

- Paimon Web UI: The Web UI is being released; feel free to try it out!
- Paimon Python: The release process for version 0.1 is expected to start soon.
- Paimon Rust: In development, with a readable version expected in release 0.1.