This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-paimon.git
The following commit(s) were added to refs/heads/master by this push:
new 58cd5a583 [doc] Adjust experimental features and fix mongo cdc doc
58cd5a583 is described below
commit 58cd5a58306cd00b07eff56875206b0e49e38269
Author: Jingsong <[email protected]>
AuthorDate: Thu Aug 17 15:38:04 2023 +0800
[doc] Adjust experimental features and fix mongo cdc doc
---
docs/content/concepts/primary-key-table.md | 16 ++++++++--------
docs/content/how-to/cdc-ingestion.md | 4 ----
2 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/docs/content/concepts/primary-key-table.md b/docs/content/concepts/primary-key-table.md
index 5f1e77aa9..c3986c947 100644
--- a/docs/content/concepts/primary-key-table.md
+++ b/docs/content/concepts/primary-key-table.md
@@ -44,10 +44,6 @@ small files, and a too small number of buckets leads to poor write performance.
### Dynamic Bucket
-{{< hint info >}}
-This is an experimental feature.
-{{< /hint >}}
-
Configure `'bucket' = '-1'`, Paimon dynamically maintains the index, automatic expansion of the number of buckets.
 - Option1: `'dynamic-bucket.target-row-num'`: controls the target row number for one bucket.
@@ -65,6 +61,10 @@ Bucket mode uses HASH index to maintain mapping from key to bucket, it requires
**Cross Partitions Update Dynamic Bucket Mode**:
+{{< hint info >}}
+This is an experimental feature.
+{{< /hint >}}
+
When you need cross partition updates (primary keys not contain all partition fields), Dynamic Bucket mode directly
maintains the mapping of keys to partition and bucket, uses local disks, and initializes indexes by reading all
existing keys in the table when starting stream write job. Different merge engines have different behaviors:
@@ -241,6 +241,10 @@ For streaming queries, `aggregation` merge engine must be used together with `lo
### First Row
+{{< hint info >}}
+This is an experimental feature.
+{{< /hint >}}
+
By specifying `'merge-engine' = 'first-row'`, users can keep the first row of the same primary key. It differs from the
`deduplicate` merge engine that in the `first-row` merge engine, it will generate insert only changelog.
@@ -282,10 +286,6 @@ By specifying `'changelog-producer' = 'input'`, Paimon writers rely on their inp
### Lookup
-{{< hint info >}}
-This is an experimental feature.
-{{< /hint >}}
-
If your input can’t produce a complete changelog but you still want to get rid of the costly normalized operator, you may consider using the `'lookup'` changelog producer.
By specifying `'changelog-producer' = 'lookup'`, Paimon will generate changelog through `'lookup'` before committing the data writing.
diff --git a/docs/content/how-to/cdc-ingestion.md b/docs/content/how-to/cdc-ingestion.md
index 7675ad697..84b0203a1 100644
--- a/docs/content/how-to/cdc-ingestion.md
+++ b/docs/content/how-to/cdc-ingestion.md
@@ -462,8 +462,6 @@ Declaring other columns as primary keys is not feasible, as delete operations on
3. MongoDB Change Streams are designed to return simple JSON documents without any data type definitions. This is because MongoDB is a document-oriented database, and one of its core features is the dynamic schema, where documents can contain different fields, and the data types of fields can be flexible. Therefore, the absence of data type definitions in Change Streams is to maintain this flexibility and extensibility.
For this reason, we have set all field data types for synchronizing MongoDB to Paimon as String to address the issue of not being able to obtain data types.
-{{< generated/mongodb_sync_table >}}
-
If the Paimon table you specify does not exist, this action will automatically create the table. Its schema will be derived from MongoDB collection.
Example 1: synchronize collection into one Paimon table
@@ -534,8 +532,6 @@ To use this feature through `flink run`, run the following shell command.
[--table-conf <paimon-table-sink-conf> [--table-conf <paimon-table-sink-conf> ...]]
```
-{{< generated/mongodb_sync_database >}}
-
All collections to be synchronized need to set _id as the primary key.
For each MongoDB collection to be synchronized, if the corresponding Paimon table does not exist, this action will automatically create the table.
Its schema will be derived from all specified MongoDB collection. If the Paimon table already exists, its schema will be compared against the schema of all specified MongoDB collection.