marvelshan commented on code in PR #15708:
URL: https://github.com/apache/datafusion/pull/15708#discussion_r2049931675


##########
docs/source/user-guide/sql/format_options.md:
##########
@@ -0,0 +1,209 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Format Options
+
+DataFusion supports customizing how data is read from or written to disk as a 
result of a `COPY`, `INSERT INTO`, or `CREATE EXTERNAL TABLE` statements. There 
are a few special options, file format (e.g., CSV or Parquet) specific options, 
and Parquet column-specific options. In some cases, Options can be specified in 
multiple ways with a set order of precedence.
+
+## Specifying Options and Order of Precedence
+
+Format-related options can be specified in three ways, in decreasing order of 
precedence:
+
+- `CREATE EXTERNAL TABLE` syntax
+- `COPY` option tuples
+- Session-level config defaults
+
+For a list of supported session-level config defaults, see [Configuration 
Settings](../configs). These defaults apply to all operations but have the 
lowest level of precedence.
+
+If creating an external table, table-specific format options can be specified 
when the table is created using the `OPTIONS` clause:
+
+```sql
+CREATE EXTERNAL TABLE
+  my_table(a bigint, b bigint)
+  STORED AS csv
+  LOCATION '/test/location/my_csv_table/'
+  OPTIONS(
+    NULL_VALUE 'NAN',
+    'has_header' 'true',
+    'format.delimiter' ';'
+  );
+```
+
+When running `INSERT INTO my_table ...`, the options from the `CREATE TABLE` 
will be respected (e.g., gzip compression, special delimiter, and header row 
included). Note that compression, header, and delimiter settings can also be 
specified within the `OPTIONS` tuple list. Dedicated syntax within the SQL 
statement always takes precedence over arbitrary option tuples, so if both are 
specified, the `OPTIONS` setting will be ignored.
+
+Finally, options can be passed when running a `COPY` command.
+
+```sql
+COPY source_table
+  TO 'test/table_with_options'
+  PARTITIONED BY (column3, column4)
+  OPTIONS (
+    format parquet,
+    compression snappy,
+    'compression::column1' 'zstd(5)',
+  )
+```
+
+In this example, we write the entirety of `source_table` out to a folder of 
Parquet files. One Parquet file will be written in parallel to the folder for 
each partition in the query. The next option `compression` set to `snappy` 
indicates that unless otherwise specified, all columns should use the snappy 
compression codec. The option `compression::col1` sets an override, so that the 
column `col1` in the Parquet file will use the ZSTD compression codec with 
compression level `5`. In general, Parquet options that support column-specific 
settings can be specified with the syntax `OPTION::COLUMN.NESTED.PATH`.
+
+# Available Options
+
+## Execution-Specific Options
+
+The following options are available when executing a `COPY` query.
+
+| Option Key                                                  | Description    
                                                                                
              | Default Value |

Review Comment:
   Thank you very much for your review! I truly appreciate you taking the time 
to make the final adjustments. I’ve learned a lot of valuable details from this 
process. If there’s anything else that needs further revision, please don’t 
hesitate to let me know—I’d be more than happy to make the necessary changes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to