This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon.git
The following commit(s) were added to refs/heads/master by this push:
new ae5261e9d [doc] Adjust default format to parquet
ae5261e9d is described below
commit ae5261e9dac6542b2bc1800ef6dc3c8673da33ab
Author: Jingsong <[email protected]>
AuthorDate: Wed Jul 3 19:48:22 2024 +0800
[doc] Adjust default format to parquet
---
docs/content/concepts/basic-concepts.md | 2 +-
docs/content/maintenance/write-performance.md | 2 +-
docs/content/project/roadmap.md | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/docs/content/concepts/basic-concepts.md b/docs/content/concepts/basic-concepts.md
index 3213ca429..8523d29f3 100644
--- a/docs/content/concepts/basic-concepts.md
+++ b/docs/content/concepts/basic-concepts.md
@@ -54,7 +54,7 @@ A manifest file is a file containing changes about LSM data files and changelog
## Data Files
-Data files are grouped by partitions. Currently, Paimon supports using orc (default), parquet and avro as data file's format.
+Data files are grouped by partitions. Currently, Paimon supports using parquet (default), orc and avro as data file's format.
## Partition
diff --git a/docs/content/maintenance/write-performance.md b/docs/content/maintenance/write-performance.md
index eabd1ce82..329e2d204 100644
--- a/docs/content/maintenance/write-performance.md
+++ b/docs/content/maintenance/write-performance.md
@@ -239,7 +239,7 @@ There are three main places in Paimon writer that takes up memory:
* Writer's memory buffer, shared and preempted by all writers of a single task. This memory value can be adjusted by the `write-buffer-size` table property.
* Memory consumed when merging several sorted runs for compaction. Can be adjusted by the `num-sorted-run.compaction-trigger` option to change the number of sorted runs to be merged.
* If the row is very large, reading too many lines of data at once will consume a lot of memory when making a compaction. Reducing the `read.batch-size` option can alleviate the impact of this case.
-* The memory consumed by writing columnar (ORC, Parquet, etc.) file. Decreasing the `orc.write.batch-size` option can reduce the consumption of memory for ORC format.
+* The memory consumed by writing columnar ORC file. Decreasing the `orc.write.batch-size` option can reduce the consumption of memory for ORC format.
* If files are automatically compaction in the write task, dictionaries for certain large columns can significantly consume memory during compaction.
* To disable dictionary encoding for all fields in Parquet format, set `'parquet.enable.dictionary'= 'false'`.
* To disable dictionary encoding for all fields in ORC format, set `orc.dictionary.key.threshold='0'`. Additionally,set `orc.column.encoding.direct='field1,field2'` to disable dictionary encoding for specific columns.
diff --git a/docs/content/project/roadmap.md b/docs/content/project/roadmap.md
index b0418decb..2f6b63af0 100644
--- a/docs/content/project/roadmap.md
+++ b/docs/content/project/roadmap.md
@@ -28,7 +28,7 @@ under the License.
## Native Format IO
-Integrate native ORC & Parquet reader & writer.
+Integrate native Parquet & ORC reader & writer.
## Deletion Vectors (Merge On Write)