(arrow-datafusion) branch main updated: Update `COPY` documentation to reflect cahnges (#9754)

alamb Fri, 29 Mar 2024 06:41:02 -0700

This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git



The following commit(s) were added to refs/heads/main by this push:
     new 2956ec2962 Update `COPY` documentation to reflect cahnges (#9754)
2956ec2962 is described below

commit 2956ec2962d7af94be53243427f8795d29fa90a3
Author: Andrew Lamb <[email protected]>
AuthorDate: Fri Mar 29 09:39:27 2024 -0400

    Update `COPY` documentation to reflect cahnges (#9754)
---
 docs/source/user-guide/sql/dml.md | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/docs/source/user-guide/sql/dml.md b/docs/source/user-guide/sql/dml.md
index b9614bb8f9..79c36092fd 100644
--- a/docs/source/user-guide/sql/dml.md
+++ b/docs/source/user-guide/sql/dml.md
@@ -25,11 +25,14 @@ and modifying data in tables.
 ## COPY
 
 Copies the contents of a table or query to file(s). Supported file
-formats are `parquet`, `csv`, and `json` and can be inferred based on
-filename if writing to a single file.
+formats are `parquet`, `csv`, `json`, and `arrow`.
 
 <pre>
-COPY { <i><b>table_name</i></b> | <i><b>query</i></b> } TO '<i><b>file_name</i></b>' [ ( <i><b>option</i></b> [, ... ] ) ]
+COPY { <i><b>table_name</i></b> | <i><b>query</i></b> } 
+TO '<i><b>file_name</i></b>'
+[ STORED AS <i><b>format</i></b> ]
+[ PARTITIONED BY <i><b>column_name</i></b> [, ...] ]
+[ OPTIONS( <i><b>option</i></b> [, ... ] ) ]
 </pre>
 
 For a detailed list of valid OPTIONS, see [Write Options](write_options).
@@ -61,7 +64,7 @@ Copy the contents of `source_table` to multiple directories
 of hive-style partitioned parquet files:
 
 ```sql
-> COPY source_table TO 'dir_name' (FORMAT parquet, partition_by 'column1, column2');
+> COPY source_table TO 'dir_name' STORED AS parquet, PARTITIONED BY (column1, column2);
 +-------+
 | count |
 +-------+
@@ -74,7 +77,7 @@ results (maintaining the order) to a parquet file named
 `output.parquet` with a maximum parquet row group size of 10MB:
 
 ```sql
-> COPY (SELECT * from source ORDER BY time) TO 'output.parquet' (ROW_GROUP_LIMIT_BYTES 10000000);
+> COPY (SELECT * from source ORDER BY time) TO 'output.parquet' OPTIONS (MAX_ROW_GROUP_SIZE 10000000);
 +-------+
 | count |
 +-------+
@@ -82,6 +85,12 @@ results (maintaining the order) to a parquet file named
 +-------+
 ```
 
+The output format is determined by the first match of the following rules:
+
+1. Value of `STORED AS`
+2. Value of the `OPTION (FORMAT ..)`
+3. Filename extension (e.g. `foo.parquet` implies `PARQUET` format)
+
 ## INSERT
 
 Insert values into a table.
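
Note on the format-resolution rules added in this commit: the "first match wins" order (`STORED AS`, then a `FORMAT` option, then the filename extension) can be illustrated with a small sketch. This is not DataFusion's implementation; the function name `resolve_format`, its parameters, and the decision to raise an error when no rule matches are all assumptions made for illustration only.

```python
from pathlib import PurePosixPath

# Formats listed as supported in the updated documentation.
KNOWN_FORMATS = {"parquet", "csv", "json", "arrow"}


def resolve_format(file_name, stored_as=None, options=None):
    """Hypothetical helper: pick an output format by the first matching rule."""
    options = options or {}

    # Rule 1: an explicit STORED AS value takes precedence.
    if stored_as:
        return stored_as.lower()

    # Rule 2: otherwise use a FORMAT entry from the options, if present.
    for key, value in options.items():
        if key.lower() == "format":
            return value.lower()

    # Rule 3: otherwise fall back to the filename extension.
    extension = PurePosixPath(file_name).suffix.lstrip(".").lower()
    if extension in KNOWN_FORMATS:
        return extension

    # Assumption: the source does not say what happens when nothing matches.
    raise ValueError(f"cannot determine output format for {file_name!r}")
```

For example, under these assumptions `resolve_format("out.csv", stored_as="PARQUET")` resolves to parquet because rule 1 matches before the `.csv` extension is ever consulted.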
