mustafasrepo opened a new pull request, #6526:
URL: https://github.com/apache/arrow-datafusion/pull/6526

   # Which issue does this PR close?
   
   Closes #
   
   # Rationale for this change
   
   This PR adds support for the following SQL queries:
   
   ```sql
   CREATE EXTERNAL TABLE source_table (
       a1  VARCHAR NOT NULL,
       a2  INT NOT NULL
   )
   STORED AS CSV
   WITH HEADER ROW
   OPTIONS ('UNBOUNDED' 'TRUE')
   LOCATION '{source}';
   
   CREATE EXTERNAL TABLE sink_table (
       a1  VARCHAR NOT NULL,
       a2  INT NOT NULL
   )
   STORED AS CSV
   WITH HEADER ROW
   OPTIONS ('UNBOUNDED' 'TRUE')
   LOCATION '{sink}';
   
   INSERT INTO sink_table
   SELECT a1, a2 FROM source_table;
   ```
   
   This PR adds support for appending data to external tables; previously, appending was supported only for memory tables. It introduces new structs and modifies existing ones so that users can append data to file-based storage.
   
   # What changes are included in this PR?
   
   - Added `FileSinkConfig` struct, which holds the base configuration used when creating a physical plan for any given file format.
   - Added `FileWriterExt` to handle writing record batches to a file-like output.
   - Added `CsvSink` struct, which implements `DataSink`, to write results to a CSV file.
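
   To illustrate the general shape of a sink that writes batches to a file-like output, here is a minimal sketch. It is not the code in this PR: `Batch`, `Sink`, `SimpleCsvSink`, and `write_batches` are hypothetical, simplified stand-ins that use plain strings instead of Arrow record batches, and the sketch does not reproduce the actual `DataSink` or `FileWriterExt` signatures.

   ```rust
   use std::io::{self, Write};

   /// Hypothetical stand-in for a record batch: rows of string cells.
   pub struct Batch {
       pub rows: Vec<Vec<String>>,
   }

   /// Hypothetical, simplified sink trait (an assumption for illustration;
   /// not the `DataSink` trait from this PR).
   pub trait Sink {
       /// Writes every row of every batch and returns the number of rows written.
       fn write_batches(&mut self, batches: &[Batch]) -> io::Result<u64>;
   }

   /// Writes rows as delimiter-separated lines to any `Write` target.
   /// No quoting or escaping is performed, so this is not a full CSV writer.
   pub struct SimpleCsvSink<W: Write> {
       writer: W,
       delimiter: char,
   }

   impl<W: Write> SimpleCsvSink<W> {
       pub fn new(writer: W, delimiter: char) -> Self {
           Self { writer, delimiter }
       }

       /// Consumes the sink and returns the underlying writer.
       pub fn into_inner(self) -> W {
           self.writer
       }
   }

   impl<W: Write> Sink for SimpleCsvSink<W> {
       fn write_batches(&mut self, batches: &[Batch]) -> io::Result<u64> {
           let sep = self.delimiter.to_string();
           let mut count = 0u64;
           for batch in batches {
               for row in &batch.rows {
                   // One output line per row, cells joined by the delimiter.
                   writeln!(self.writer, "{}", row.join(sep.as_str()))?;
                   count += 1;
               }
           }
           self.writer.flush()?;
           Ok(count)
       }
   }
   ```

   Because the sketch is generic over `Write`, the same sink can target an in-memory `Vec<u8>` in tests or a file opened in append mode (for example via `std::fs::OpenOptions::new().append(true).create(true)`), which mirrors the append-only behavior described above.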
   
   # Are these changes tested?
   
   Yes
   
   # Are there any user-facing changes?
   
   Yes. Users can now append data to external, file-backed tables with `INSERT INTO`, which was not possible before.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

