(paimon) branch master updated: [doc] Adjust default format to parquet

lzljs3620320 Wed, 03 Jul 2024 04:49:44 -0700

This is an automated email from the ASF dual-hosted git repository.

lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon.git



The following commit(s) were added to refs/heads/master by this push:
     new ae5261e9d [doc] Adjust default format to parquet
ae5261e9d is described below

commit ae5261e9dac6542b2bc1800ef6dc3c8673da33ab
Author: Jingsong <[email protected]>
AuthorDate: Wed Jul 3 19:48:22 2024 +0800

    [doc] Adjust default format to parquet
---
 docs/content/concepts/basic-concepts.md       | 2 +-
 docs/content/maintenance/write-performance.md | 2 +-
 docs/content/project/roadmap.md               | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/content/concepts/basic-concepts.md b/docs/content/concepts/basic-concepts.md
index 3213ca429..8523d29f3 100644
--- a/docs/content/concepts/basic-concepts.md
+++ b/docs/content/concepts/basic-concepts.md
@@ -54,7 +54,7 @@ A manifest file is a file containing changes about LSM data files and changelog
 
 ## Data Files
 
-Data files are grouped by partitions. Currently, Paimon supports using orc (default), parquet and avro as data file's format.
+Data files are grouped by partitions. Currently, Paimon supports using parquet (default), orc and avro as data file's format.
 
 ## Partition
 
diff --git a/docs/content/maintenance/write-performance.md b/docs/content/maintenance/write-performance.md
index eabd1ce82..329e2d204 100644
--- a/docs/content/maintenance/write-performance.md
+++ b/docs/content/maintenance/write-performance.md
@@ -239,7 +239,7 @@ There are three main places in Paimon writer that takes up memory:
 * Writer's memory buffer, shared and preempted by all writers of a single task. This memory value can be adjusted by the `write-buffer-size` table property.
 * Memory consumed when merging several sorted runs for compaction. Can be adjusted by the `num-sorted-run.compaction-trigger` option to change the number of sorted runs to be merged.
 * If the row is very large, reading too many lines of data at once will consume a lot of memory when making a compaction. Reducing the `read.batch-size` option can alleviate the impact of this case.
-* The memory consumed by writing columnar (ORC, Parquet, etc.) file. Decreasing the `orc.write.batch-size` option can reduce the consumption of memory for ORC format.
+* The memory consumed by writing columnar ORC file. Decreasing the `orc.write.batch-size` option can reduce the consumption of memory for ORC format.
 * If files are automatically compaction in the write task, dictionaries for certain large columns can significantly consume memory during compaction.
   * To disable dictionary encoding for all fields in Parquet format, set `'parquet.enable.dictionary'= 'false'`.
   * To disable dictionary encoding for all fields in ORC format, set `orc.dictionary.key.threshold='0'`. Additionally,set `orc.column.encoding.direct='field1,field2'` to disable dictionary encoding for specific columns.
diff --git a/docs/content/project/roadmap.md b/docs/content/project/roadmap.md
index b0418decb..2f6b63af0 100644
--- a/docs/content/project/roadmap.md
+++ b/docs/content/project/roadmap.md
@@ -28,7 +28,7 @@ under the License.
 
 ## Native Format IO
 
-Integrate native ORC & Parquet reader & writer.
+Integrate native Parquet & ORC reader & writer.
 
 ## Deletion Vectors (Merge On Write)
