This is an automated email from the ASF dual-hosted git repository.

lzljs3620320 pushed a commit to branch release-0.6
in repository https://gitbox.apache.org/repos/asf/incubator-paimon.git

commit d8faef64973f774e629189f35b152f9c47359bd2
Author: Jingsong <[email protected]>
AuthorDate: Wed Dec 6 22:22:29 2023 +0800

    [doc] Recommend to use 200MB bucket size
---
 docs/content/concepts/basic-concepts.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/content/concepts/basic-concepts.md b/docs/content/concepts/basic-concepts.md
index 49842be21..7935b1fa2 100644
--- a/docs/content/concepts/basic-concepts.md
+++ b/docs/content/concepts/basic-concepts.md
@@ -50,7 +50,7 @@ Unpartitioned tables, or partitions in partitioned tables, are sub-divided into
 
 The range for a bucket is determined by the hash value of one or more columns in the records. Users can specify bucketing columns by providing the [`bucket-key` option]({{< ref "maintenance/configurations#coreoptions" >}}). If no `bucket-key` option is specified, the primary key (if defined) or the complete record will be used as the bucket key.
 
-A bucket is the smallest storage unit for reads and writes, so the number of buckets limits the maximum processing parallelism. This number should not be too big, though, as it will result in lots of small files and low read performance. In general, the recommended data size in each bucket is about 1GB.
+A bucket is the smallest storage unit for reads and writes, so the number of buckets limits the maximum processing parallelism. This number should not be too big, though, as it will result in lots of small files and low read performance. In general, the recommended data size in each bucket is about 200MB - 1GB.
 
 See [file layouts]({{< ref "concepts/file-layouts" >}}) for how files are divided into buckets. Also, see [rescale bucket]({{< ref "maintenance/rescale-bucket" >}}) if you want to adjust the number of buckets after a table is created.
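
The sizing guidance in the changed line can be turned into rough arithmetic. The sketch below is not part of Paimon; `bucket_count_range` is a hypothetical helper, and it assumes data is spread evenly across buckets and that "MB" and "GB" mean binary units (1024-based). It only shows which bucket counts would keep each bucket within the recommended 200MB - 1GB range for a given table or partition size.

```python
import math

MB = 1024 ** 2  # assumption: binary units
GB = 1024 ** 3


def bucket_count_range(total_bytes, min_bucket_bytes=200 * MB, max_bucket_bytes=1 * GB):
    """Return (fewest, most) bucket counts that keep the average bucket
    size between min_bucket_bytes and max_bucket_bytes.

    Hypothetical helper for illustration; assumes an even hash distribution.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Fewest buckets: each bucket holds at most max_bucket_bytes.
    fewest = max(1, math.ceil(total_bytes / max_bucket_bytes))
    # Most buckets: each bucket still holds at least min_bucket_bytes.
    most = max(fewest, total_bytes // min_bucket_bytes)
    return fewest, most


if __name__ == "__main__":
    fewest, most = bucket_count_range(50 * GB)
    print(f"50 GB partition: between {fewest} and {most} buckets")
```

Under these assumptions, a 50 GB partition would fit the recommendation with anywhere from 50 buckets (about 1GB each) to 256 buckets (about 200MB each). The lower end favors fewer, larger files, while the upper end allows more parallelism.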
