[I] COPY fails on cli with Invalid statement [arrow-datafusion]

via GitHub Wed, 03 Apr 2024 11:21:35 -0700


hveiga opened a new issue, #9927:
URL: https://github.com/apache/arrow-datafusion/issues/9927


   ### Describe the bug
   
   I downloaded 
https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-37.0.0-rc2/
 to test the new `partition_by` feature. I built `datafusion-cli` by running 
`cargo build --release` under `datafusion-cli`.
   
   The use case is simple: load a parquet file and write it back out as 
multiple hive-partitioned parquet files.
   When I run the `COPY` command documented at 
https://arrow.apache.org/datafusion/user-guide/sql/write_options.html I get an 
error.
   
   ### To Reproduce
   
   1. Build `datafusion-cli` from 
https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-37.0.0-rc2/.
   2. Run `./datafusion-cli`.
   3. Create a table from a parquet file:
   ```
   CREATE EXTERNAL TABLE t1
   STORED AS PARQUET
   LOCATION '/tmp/file.parquet';
   ```
   4. Run the `COPY` command with the `partition_by` option:
   
   ```
   COPY t1 TO '/tmp/hive_output/' (format parquet, partition_by 'col1');
   ```
   5. Get an error: `🤔 Invalid statement: sql parser error: Unexpected token (`
   
   ### Expected behavior
   
   Have the `COPY` statement generate the expected hive-partitioned parquet 
files.
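   
To make "hive-partitioned" concrete, here is a minimal sketch of the layout I would expect, assuming the usual `key=value` directory convention. It is not DataFusion code: the function name `hive_partition_dirs` and the sample rows are hypothetical, and it only models which directory each row would land in.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def hive_partition_dirs(rows, partition_col, root):
    """Group rows by partition_col and map each group to a key=value directory.

    Hypothetical helper for illustration only; it does not write any files.
    """
    groups = defaultdict(list)
    for row in rows:
        # Hive-style layouts usually drop the partition column from the data
        # files, because its value is already encoded in the directory name.
        rest = {k: v for k, v in row.items() if k != partition_col}
        directory = PurePosixPath(root) / f"{partition_col}={row[partition_col]}"
        groups[str(directory)].append(rest)
    return dict(groups)


# Made-up sample data standing in for the contents of t1.
rows = [
    {"col1": "a", "col2": 1},
    {"col1": "b", "col2": 2},
    {"col1": "a", "col2": 3},
]
layout = hive_partition_dirs(rows, "col1", "/tmp/hive_output/")
for directory in sorted(layout):
    print(directory, layout[directory])
```

With the sample rows above, I would expect one directory per distinct `col1` value under `/tmp/hive_output/`, each holding the parquet file(s) for that value.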
   
   ### Additional context
   
   I don't know whether the problem is in my SQL statement or whether the 
`COPY` documentation is incorrect. Still, I thought it was worth reporting 
before `37.0.0` is released. 
https://github.com/apache/arrow-datafusion/issues/9682
   
   Thank you!


