This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-paimon.git
The following commit(s) were added to refs/heads/master by this push:
new 6e1b9d80f [doc] Explain Fixed Bucket and Dynamic Bucket
6e1b9d80f is described below
commit 6e1b9d80f1161099783d6e27471656d50cb60741
Author: Jingsong <[email protected]>
AuthorDate: Wed Nov 22 12:22:12 2023 +0800
[doc] Explain Fixed Bucket and Dynamic Bucket
---
docs/content/concepts/primary-key-table.md | 14 ++++++++++----
.../main/java/org/apache/paimon/flink/sink/FlinkSink.java | 2 +-
2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/docs/content/concepts/primary-key-table.md b/docs/content/concepts/primary-key-table.md
index 576f2b599..0baffd055 100644
--- a/docs/content/concepts/primary-key-table.md
+++ b/docs/content/concepts/primary-key-table.md
@@ -38,13 +38,19 @@ A bucket is the smallest storage unit for reads and writes, each bucket director
### Fixed Bucket
-Configure a bucket greater than 0, rescaling buckets can only be done through offline processes,
-see [Rescale Bucket]({{< ref "/maintenance/rescale-bucket" >}}). A too large number of buckets leads to too many
-small files, and a too small number of buckets leads to poor write performance.
+Configure a bucket greater than 0 to use Fixed Bucket mode, where the bucket of a record is computed
+as `Math.abs(key_hashcode % numBuckets)`.
+
+Rescaling buckets can only be done through offline processes, see [Rescale Bucket]({{< ref "/maintenance/rescale-bucket" >}}).
+Too many buckets lead to too many small files, and too few buckets lead to poor write performance.
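The fixed-bucket formula added above can be sketched in plain Java (a minimal illustration of the documented expression; the class and method names and the sample hash values are ours, not Paimon's):

```java
public class FixedBucketExample {

    /** Mirrors the doc's formula: Math.abs(key_hashcode % numBuckets). */
    static int bucketOf(int keyHashcode, int numBuckets) {
        // Java's % keeps the sign of the dividend, so abs() is needed
        // to map negative hash codes into [0, numBuckets).
        return Math.abs(keyHashcode % numBuckets);
    }

    public static void main(String[] args) {
        // A record's bucket never changes while numBuckets is fixed,
        // which is why rescaling requires an offline process.
        System.out.println(bucketOf(42, 4));  // 42 % 4 = 2
        System.out.println(bucketOf(-7, 4));  // -7 % 4 = -3, abs = 3
    }
}
```

Note that because the bucket depends only on the key hash and `numBuckets`, every writer computes the same bucket for the same key without any coordination.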
### Dynamic Bucket
-Configure `'bucket' = '-1'`, Paimon dynamically maintains the index, automatic expansion of the number of buckets.
+Configure `'bucket' = '-1'`. Keys that arrive first fall into the old buckets, and new keys fall into
+new buckets; the distribution of buckets and keys depends on the order in which the data arrives.
+Paimon maintains an index to determine which key corresponds to which bucket.
+
+Paimon will automatically expand the number of buckets.
- Option1: `'dynamic-bucket.target-row-num'`: controls the target row number for one bucket.
- Option2: `'dynamic-bucket.assigner-parallelism'`: parallelism of the assigner operator, controls the number of initialized buckets.
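The dynamic-bucket behaviour described above can be sketched as a toy assigner (an illustration only, not Paimon's actual implementation; the class name, the in-memory `HashMap` index, and the `targetRowNum` constructor parameter standing in for `'dynamic-bucket.target-row-num'` are all our assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of a dynamic bucket assigner: keys seen before keep their
// bucket (via the index), new keys fill the current bucket, and a new
// bucket is opened once the current one reaches the target row count.
public class DynamicBucketAssigner {

    private final long targetRowNum;                             // target rows per bucket
    private final Map<String, Integer> index = new HashMap<>();  // key -> bucket
    private final Map<Integer, Long> bucketSizes = new HashMap<>();
    private int currentBucket = 0;

    public DynamicBucketAssigner(long targetRowNum) {
        this.targetRowNum = targetRowNum;
    }

    public int assign(String key) {
        // Keys that arrived earlier always go back to their original bucket.
        Integer existing = index.get(key);
        if (existing != null) {
            return existing;
        }
        // New keys fill the current bucket; expand when it is full.
        long size = bucketSizes.getOrDefault(currentBucket, 0L);
        if (size >= targetRowNum) {
            currentBucket++;
            size = 0L;
        }
        bucketSizes.put(currentBucket, size + 1);
        index.put(key, currentBucket);
        return currentBucket;
    }
}
```

This is why the bucket layout depends on arrival order: replaying the same records in a different order would produce a different key-to-bucket mapping, and the index is what makes the mapping stable afterwards.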
diff --git a/paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/FlinkSink.java b/paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/FlinkSink.java
index d0bae779d..b305f0c0a 100644
--- a/paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/FlinkSink.java
+++ b/paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/FlinkSink.java
@@ -171,7 +171,7 @@ public abstract class FlinkSink<T> implements Serializable {
Options options = Options.fromMap(table.options());
if (options.get(SINK_USE_MANAGED_MEMORY)) {
             MemorySize memorySize = options.get(SINK_MANAGED_WRITER_BUFFER_MEMORY);
-            written.getTransformation()
+            written.getTransformation().declareManagedMemoryUseCaseAtSlotScope()
                     .declareManagedMemoryUseCaseAtOperatorScope(
                             ManagedMemoryUseCase.OPERATOR, memorySize.getMebiBytes());
}