This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-paimon-website.git
The following commit(s) were added to refs/heads/master by this push:
new e2fa4fb5e4 Add 0.7 release notes (#5)
e2fa4fb5e4 is described below
commit e2fa4fb5e4c1ea829e3e14b5471859f7df61ee47
Author: yuzelin <[email protected]>
AuthorDate: Fri Mar 1 10:53:37 2024 +0800
Add 0.7 release notes (#5)
---
pages/content/releases/release-0.7.md | 102 ++++++++++++++++++++++++++++++++++
1 file changed, 102 insertions(+)
diff --git a/pages/content/releases/release-0.7.md b/pages/content/releases/release-0.7.md
new file mode 100644
index 0000000000..6c5956ff23
--- /dev/null
+++ b/pages/content/releases/release-0.7.md
@@ -0,0 +1,102 @@
+---
+title: "Release 0.7"
+weight: 997
+type: docs
+aliases:
+- /release-0.7.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Apache Paimon 0.7 Available
+
+February 29, 2024 - Paimon Community ([email protected])
+
+The Apache Paimon PPMC has officially released Apache Paimon 0.7.0-incubating. A total of 34 people contributed to
+this release and completed over 300 commits. Thank you to all contributors for their support!
+
+This release mainly focuses on enhancing and optimizing existing features. For more details, please
+refer to the content below.
+
+## Flink
+
+### Lookup Join
+
+1. Fix a bug where lookup join could not handle the sequence field of the dim table.
+2. Introduce primary key partial lookup based on Paimon hash lookup for lookup join.
+3. Use parallel reading and bulk loading to speed up the initial data loading of the dim table.
+
+### Paimon CDC
+
+In 0.7, we continued to improve Paimon CDC:
+
+1. Support Postgres table synchronization.
+2. Paimon commits data on checkpoints, but many users forget to enable checkpointing when submitting a CDC job. In this case, the job
+sets the checkpoint interval to 180 seconds.
+3. Support UNAWARE bucket mode tables as CDC sinks.
+4. Support extracting a time attribute from the CDC input as the watermark. You can now set `tag.automatic-creation` to `watermark` in CDC jobs.
+
+## Spark
+
+1. Merge-into supports WHEN NOT MATCHED BY SOURCE semantics.
+2. Sort compact supports Hilbert Curve sorter.
+3. Multiple optimizations to improve query performance.
+
+## Hive
+
+In 0.7, we mainly focused on improving compatibility with Hive.
+
+1. Support the timestamp with local time zone type.
+2. Support creating a database with location, comment, and properties in Hive SQL.
+3. Support table comments.
+
+## Tag Management
+
+1. Support a new `tag.automatic-creation` mode, `batch`. In this mode, a tag is created after a batch job completes.
+2. Automatic tag creation relies on commits, so if there is no commit when it is time to auto-create a tag, the tag won't be created.
+To handle this case, we introduce the option `snapshot.watermark-idle-timeout`. If the Flink source is idle for longer than the specified
+timeout duration, the job forces the creation of a snapshot and thus triggers tag creation.
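
The idle-timeout rule in item 2 can be pictured with a small sketch. This is plain Python for illustration only, not Paimon or Flink code: the function name `should_force_snapshot` and its millisecond parameters are hypothetical, and the real job may track idleness differently.

```python
from typing import Optional


def should_force_snapshot(
    last_record_millis: int,
    now_millis: int,
    idle_timeout_millis: Optional[int],
) -> bool:
    """Return True when the source has been idle longer than the timeout.

    Hypothetical helper: `idle_timeout_millis` stands in for the value of
    `snapshot.watermark-idle-timeout`; None means the option is not set.
    """
    if idle_timeout_millis is None:
        # Without the option, no snapshot is forced (assumed behavior).
        return False
    return now_millis - last_record_millis > idle_timeout_millis
```

With a 60-second timeout, a source whose last record arrived 90 seconds ago would qualify for a forced snapshot in this model, while one that received data 10 seconds ago would not.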
+
+## New Aggregation Functions
+
+1. count: counts the values across multiple rows.
+2. product: computes the product of values across multiple rows.
+3. nested-update: collects multiple rows into one ARRAY<ROW> (a so-called 'nested table'). You can use `fields.<field-name>.nested-key=pk0,pk1,...` to
+define the primary keys of the nested table. If no keys are defined, the rows are appended to the array.
+4. collect: collects elements into an ARRAY. You can set `fields.<field-name>.distinct=true` to deduplicate elements.
+5. merge_map: merges input maps into a single map.
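
To make the list above concrete, here is a minimal Python sketch of what each function computes over the values of one field. It is an illustration, not Paimon's implementation: the `agg_*` names are hypothetical, and null handling, the result for empty input, conflict resolution in `merge_map`, and replace-by-key in `nested-update` are all assumptions.

```python
from functools import reduce
from operator import mul


def agg_count(values):
    """Count non-null values across rows (null handling is an assumption)."""
    return sum(1 for v in values if v is not None)


def agg_product(values):
    """Multiply non-null values across rows; empty input yields 1 here."""
    return reduce(mul, (v for v in values if v is not None), 1)


def agg_collect(values, distinct=False):
    """Collect values into a list; optionally deduplicate, keeping first-seen order."""
    if not distinct:
        return list(values)
    return list(dict.fromkeys(values))


def agg_merge_map(maps):
    """Merge maps into one; on key conflicts the later row wins (an assumption)."""
    merged = {}
    for m in maps:
        merged.update(m)
    return merged


def agg_nested_update(rows, nested_keys=None):
    """Collect rows into a list; with nested keys, a later row replaces
    the earlier row that has the same key (an assumption)."""
    if not nested_keys:
        return list(rows)  # no keys defined: rows are simply appended
    by_key = {}
    for row in rows:
        by_key[tuple(row[k] for k in nested_keys)] = row
    return list(by_key.values())
```

For example, `agg_collect([1, 2, 2, 1], distinct=True)` models a field configured with `fields.<field-name>.distinct=true`, and `agg_nested_update(rows, ["id"])` models `fields.<field-name>.nested-key=id`.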
+
+## New Metrics
+
+1. Support Flink standard connector metric `currentEmitEventTimeLag`.
+2. Support `level0FileCount` to show the compaction progress.
+
+## Other Improvements
+
+Besides the above, there are some useful improvements to existing features:
+
+1. New time travel option `scan.file-creation-time-millis`: when this option is specified, only the data files created after
+this time are read. It is more convenient than `scan.timestamp-millis` and `scan.tag-name`, but is imprecise (depending
+on whether compaction occurs).
+2. For primary key tables, the row kind can now be determined by a field specified with the option `rowkind.field`.
+3. Support ignoring delete records in deduplicate mode with the option `deduplicate.ignore-delete`.
+4. Support ignoring the consumer id when starting a streaming read job with the option `consumer.ignore-progress`.
+5. Support a new procedure `expire_snapshots` to manually trigger snapshot expiration.
+6. Support a new system table `aggregation_fields` to show aggregation field information for aggregation or partial-update tables.
+7. Introduce a bloom filter to speed up local file lookup, which benefits both the lookup changelog-producer and Flink lookup join.
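
The imprecision mentioned for `scan.file-creation-time-millis` in item 1 can be illustrated with a toy model. The sketch below is plain Python, not Paimon code: `DataFile`, `scan_by_file_creation_time`, and `compact` are hypothetical names, and it assumes that compaction rewrites rows into a new file with a fresh creation time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataFile:
    name: str
    creation_time_millis: int
    rows: Tuple[str, ...]


def scan_by_file_creation_time(files: List[DataFile], threshold_millis: int) -> List[str]:
    """Return the rows of every file created after the threshold."""
    return [
        row
        for f in files
        if f.creation_time_millis > threshold_millis
        for row in f.rows
    ]


def compact(files: List[DataFile], now_millis: int) -> List[DataFile]:
    """Toy compaction: merge all files into one file stamped with a new creation time."""
    merged_rows = tuple(row for f in files for row in f.rows)
    return [DataFile("compacted", now_millis, merged_rows)]


files = [
    DataFile("f1", 100, ("old-row",)),
    DataFile("f2", 200, ("new-row",)),
]
before = scan_by_file_creation_time(files, 150)
after = scan_by_file_creation_time(compact(files, 300), 150)
```

In this model `before` contains only `new-row`, while `after` also contains `old-row`, because the compacted file carries the older data under a newer creation time. The filter works on files, not on the time each record was written.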