This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-paimon.git
The following commit(s) were added to refs/heads/master by this push:
new 6e1b9d80f [doc] Explain Fixed Bucket and Dynamic Bucket
6e1b9d80f is described below
commit 6e1b9d80f1161099783d6e27471656d50cb60741
Author: Jingsong <[email protected]>
AuthorDate: Wed Nov 22 12:22:12 2023 +0800
[doc] Explain Fixed Bucket and Dynamic Bucket
---
docs/content/concepts/primary-key-table.md | 14 ++++++++++----
.../main/java/org/apache/paimon/flink/sink/FlinkSink.java | 2 +-
2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/docs/content/concepts/primary-key-table.md b/docs/content/concepts/primary-key-table.md
index 576f2b599..0baffd055 100644
--- a/docs/content/concepts/primary-key-table.md
+++ b/docs/content/concepts/primary-key-table.md
@@ -38,13 +38,19 @@ A bucket is the smallest storage unit for reads and writes, each bucket director
### Fixed Bucket
-Configure a bucket greater than 0, rescaling buckets can only be done through offline processes,
-see [Rescale Bucket]({{< ref "/maintenance/rescale-bucket" >}}). A too large number of buckets leads to too many
-small files, and a too small number of buckets leads to poor write performance.
+Configure a bucket greater than 0 to use Fixed Bucket mode, where the bucket of a record is computed
+as `Math.abs(key_hashcode % numBuckets)`.
+
+Rescaling buckets can only be done through offline processes, see [Rescale Bucket]({{< ref "/maintenance/rescale-bucket" >}}).
+Too many buckets lead to too many small files, and too few buckets lead to poor write performance.
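The fixed-bucket formula added above can be sketched in plain Java (a minimal illustration of the documented expression; the class and method names and the sample hash values are ours, not Paimon's):

```java
public class FixedBucketExample {

    /** Mirrors the doc's formula: Math.abs(key_hashcode % numBuckets). */
    static int bucketOf(int keyHashcode, int numBuckets) {
        // Java's % keeps the sign of the dividend, so abs() is needed
        // to map negative hash codes into [0, numBuckets).
        return Math.abs(keyHashcode % numBuckets);
    }

    public static void main(String[] args) {
        // A record's bucket never changes while numBuckets is fixed,
        // which is why rescaling requires an offline process.
        System.out.println(bucketOf(42, 4));  // 42 % 4 = 2
        System.out.println(bucketOf(-7, 4));  // -7 % 4 = -3, abs = 3
    }
}
```

Note that because the bucket depends only on the key hash and `numBuckets`, every writer computes the same bucket for the same key without any coordination.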
### Dynamic Bucket
-Configure `'bucket' = '-1'`, Paimon dynamically maintains the index, automatic expansion of the number of buckets.
+Configure `'bucket' = '-1'`. Keys that arrive first fall into the old buckets, and new keys fall into
+new buckets; the distribution of buckets and keys depends on the order in which the data arrives.
+Paimon maintains an index to determine which key corresponds to which bucket.
+
+Paimon will automatically expand the number of buckets.
- Option1: `'dynamic-bucket.target-row-num'`: controls the target row number for one bucket.
- Option2: `'dynamic-bucket.assigner-parallelism'`: parallelism of the assigner operator, controls the number of initialized buckets.
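The dynamic-bucket behaviour described above can be sketched as a toy assigner (an illustration only, not Paimon's actual implementation; the class name, the in-memory `HashMap` index, and the `targetRowNum` constructor parameter standing in for `'dynamic-bucket.target-row-num'` are all our assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of a dynamic bucket assigner: keys seen before keep their
// bucket (via the index), new keys fill the current bucket, and a new
// bucket is opened once the current one reaches the target row count.
public class DynamicBucketAssigner {

    private final long targetRowNum;                             // target rows per bucket
    private final Map<String, Integer> index = new HashMap<>();  // key -> bucket
    private final Map<Integer, Long> bucketSizes = new HashMap<>();
    private int currentBucket = 0;

    public DynamicBucketAssigner(long targetRowNum) {
        this.targetRowNum = targetRowNum;
    }

    public int assign(String key) {
        // Keys that arrived earlier always go back to their original bucket.
        Integer existing = index.get(key);
        if (existing != null) {
            return existing;
        }
        // New keys fill the current bucket; expand when it is full.
        long size = bucketSizes.getOrDefault(currentBucket, 0L);
        if (size >= targetRowNum) {
            currentBucket++;
            size = 0L;
        }
        bucketSizes.put(currentBucket, size + 1);
        index.put(key, currentBucket);
        return currentBucket;
    }
}
```

This is why the bucket layout depends on arrival order: replaying the same records in a different order would produce a different key-to-bucket mapping, and the index is what makes the mapping stable afterwards.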
diff --git a/paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/FlinkSink.java b/paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/FlinkSink.java
index d0bae779d..b305f0c0a 100644
--- a/paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/FlinkSink.java
+++ b/paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/FlinkSink.java
@@ -171,7 +171,7 @@ public abstract class FlinkSink<T> implements Serializable {
Options options = Options.fromMap(table.options());
if (options.get(SINK_USE_MANAGED_MEMORY)) {
             MemorySize memorySize = options.get(SINK_MANAGED_WRITER_BUFFER_MEMORY);
-            written.getTransformation()
+            written.getTransformation().declareManagedMemoryUseCaseAtSlotScope()
                     .declareManagedMemoryUseCaseAtOperatorScope(
                             ManagedMemoryUseCase.OPERATOR, memorySize.getMebiBytes());
}