devinjdangelo opened a new pull request, #7294:
URL: https://github.com/apache/arrow-datafusion/pull/7294

   Draft until #7283 is merged, since this PR builds on it.
   
   ## Which issue does this PR close?
   
   Closes #7228
   (That issue was closed prematurely by #7276 by accident; this PR should close it for real.)
   
   ## Rationale for this change
   
   The existing sqllogictest tests for `INSERT INTO` only cover memory tables. Expanding them to cover external tables will improve our test coverage and our validation of changes to FileSinks.
   
   ## What changes are included in this PR?
   
   - Adds `insert_to_external.slt`, which replicates many of the tests in `insert.slt` but for external ListingTables
   - Refactors the logic in the `Copy To` physical plan that creates a local file path when it does not exist into a new method, `ListingTableUrl::parse_create_local_if_not_exists`
   - Adds a new option for external table creation (`create_local_path`, shown in the examples) that allows the local path to be created
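
The "create the local path if it does not exist" step can be sketched roughly as follows. This is a minimal standalone illustration using only the Rust standard library, not the code in this PR: the function name `create_local_if_not_exists` and the `single_file` flag are hypothetical, and the real `ListingTableUrl::parse_create_local_if_not_exists` may decide between a file and a directory differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (an assumption, not DataFusion's implementation):
/// make sure a local table location exists before anything is written to it.
///
/// * `single_file == true`: create missing parent directories, then an empty file.
/// * `single_file == false`: create the directory and any missing parents.
fn create_local_if_not_exists(location: &str, single_file: bool) -> io::Result<()> {
    let path = Path::new(location);
    if path.exists() {
        // Nothing to do; existing files and directories are left untouched.
        return Ok(());
    }
    if single_file {
        if let Some(parent) = path.parent() {
            // `parent` is empty for a bare file name, so skip creation then.
            if !parent.as_os_str().is_empty() {
                fs::create_dir_all(parent)?;
            }
        }
        fs::File::create(path)?;
    } else {
        fs::create_dir_all(path)?;
    }
    Ok(())
}
```

Under these assumptions, calling the helper with `single_file` set to `true` would correspond to the single CSV file example, and calling it with `false` to the directory of parquet files.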
   
   ## Examples:
   
   ### Create a single csv file and append values
   ```sql
   statement ok
   CREATE EXTERNAL TABLE
   single_file_test(a bigint, b bigint)
   STORED AS csv
   LOCATION 'test_files/scratch/single_csv_table.csv'
   OPTIONS(
   create_local_path 'true',
   single_file 'true',
   );
   
   query II
   INSERT INTO single_file_test values (1, 2), (3, 4);
   ----
   2
   
   query II
   select * from single_file_test;
   ----
   1 2
   3 4
   ```
   
   ### Create a table backed by a directory of parquet files, insert new files
   ```sql
   statement ok
   CREATE EXTERNAL TABLE
   directory_test(a bigint, b bigint)
   STORED AS parquet
   LOCATION 'test_files/scratch/external_parquet_table_q0'
   OPTIONS(
   create_local_path 'true',
   );
   
   query II
   INSERT INTO directory_test values (1, 2), (3, 4);
   ----
   2
   
   query II
   select * from directory_test;
   ----
   1 2
   3 4
   ```
   
   ## Are these changes tested?
   
   Yes
   
   ## Are there any user-facing changes?
   
   Yes. External table creation supports a new option, `create_local_path`, which allows the local path to be created.

