geoffreyclaude opened a new pull request, #19383:
URL: https://github.com/apache/datafusion/pull/19383

   ## Which issue does this PR close?
   
   Related to #16756
   
   ## Rationale for this change
   
   The existing `sql_dialect.rs` example demonstrates `COPY ... STORED AS ...`, which the standard `DFParser` already fully supports.
   
   This PR replaces it with the example from #16756: `CREATE EXTERNAL CATALOG ... STORED AS ... LOCATION ...` with automatic table discovery.
   
   ## What changes are included in this PR?
   
   The first commit updates `dialect.rs` to show that `DFParser` already 
handles `COPY ... STORED AS`, making it clear this syntax doesn't need 
customization.
   
   Example output from `cargo run --example sql_ops -- dialect`:
   
   ```
   Query: COPY source_table TO 'file.fasta' STORED AS FASTA
   --- Parsing without extension ---
   Standard DFParser: Parsed as Statement::CopyTo: COPY source_table TO file.fasta STORED AS FASTA
   
   --- Parsing with extension ---
   Custom MyParser: Parsed as MyStatement::MyCopyTo: COPY source_table TO 'file.fasta' STORED AS FASTA
   ```
   
   The second commit adds a new `custom_sql_parser.rs` example that implements 
`CREATE EXTERNAL CATALOG my_catalog STORED AS <format> LOCATION '<url>'` with 
automatic table discovery from object storage. It also removes the old 
`dialect.rs` example.
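
To make the shape of the custom statement concrete, here is a minimal, std-only sketch of recognizing `CREATE EXTERNAL CATALOG <name> STORED AS <format> LOCATION '<url>'`. It is not the code in `custom_sql_parser.rs` and does not use the `sqlparser` or DataFusion APIs. The `CreateExternalCatalog` struct and `parse_create_external_catalog` function are hypothetical names chosen for illustration. The sketch also ignores the `OPTIONS (...)` clause and assumes the location contains no whitespace.

```rust
// Hypothetical, std-only sketch; names here are illustrative, not from the PR.

/// Illustrative AST node for the custom DDL statement.
#[derive(Debug, PartialEq)]
pub struct CreateExternalCatalog {
    pub name: String,
    pub format: String,
    pub location: String,
}

/// Returns `Some(..)` when `sql` has the exact shape
/// `CREATE EXTERNAL CATALOG <name> STORED AS <format> LOCATION '<url>'`,
/// and `None` otherwise so a caller could fall back to the standard parser.
pub fn parse_create_external_catalog(sql: &str) -> Option<CreateExternalCatalog> {
    // Simplification: whitespace tokenization, so quoted locations with spaces are not handled.
    let tokens: Vec<&str> = sql.split_whitespace().collect();
    if tokens.len() != 9 {
        return None;
    }

    // Case-insensitive keyword check at a fixed position.
    let kw = |i: usize, word: &str| tokens[i].eq_ignore_ascii_case(word);
    let keywords_match = kw(0, "CREATE")
        && kw(1, "EXTERNAL")
        && kw(2, "CATALOG")
        && kw(4, "STORED")
        && kw(5, "AS")
        && kw(7, "LOCATION");
    if !keywords_match {
        return None;
    }

    // The location must be a single-quoted literal; an optional trailing `;` is dropped.
    let location = tokens[8]
        .trim_end_matches(';')
        .strip_prefix('\'')?
        .strip_suffix('\'')?;

    Some(CreateExternalCatalog {
        name: tokens[3].to_string(),
        format: tokens[6].to_string(),
        location: location.to_string(),
    })
}
```

Returning `None` for anything that is not the custom statement mirrors the usual pattern for parser extensions: handle the new syntax first, then delegate everything else to the standard parser.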
   
   ## Are these changes tested?
   
   Yes, the new example is runnable with `cargo run --example sql_ops -- custom_sql_parser` and demonstrates the full flow from parsing custom DDL through registering the catalog to querying discovered tables.
   
   Example output:
   
   ```
   === Part 1: Standard DataFusion Parser ===
   
   Parsing: CREATE EXTERNAL CATALOG parquet_testing
            STORED AS parquet
            LOCATION 'local://workspace/parquet-testing/data'
            OPTIONS (
              'schema_name' = 'staged_data',
              'format.pruning' = 'true'
            )
   
   Error: SQL error: ParserError("Expected: TABLE, found: CATALOG at Line: 1, Column: 17")
   
   === Part 2: Custom Parser ===
   
   Parsing: CREATE EXTERNAL CATALOG parquet_testing
            STORED AS parquet
            LOCATION 'local://workspace/parquet-testing/data'
            OPTIONS (
              'schema_name' = 'staged_data',
              'format.pruning' = 'true'
            )
   
     Target Catalog: parquet_testing
     Data Location: local://workspace/parquet-testing/data
     Resolved Schema: staged_data
     Registered 69 tables into schema: staged_data
   Executing: SELECT id, bool_col, tinyint_col FROM parquet_testing.staged_data.alltypes_plain LIMIT 5
   
   +----+----------+-------------+
   | id | bool_col | tinyint_col |
   +----+----------+-------------+
   | 4  | true     | 0           |
   | 5  | false    | 1           |
   | 6  | true     | 0           |
   | 7  | false    | 1           |
   | 2  | true     | 0           |
   +----+----------+-------------+
   ```
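
The "Registered 69 tables" line in the output above comes from the table discovery step. One plausible way to map listed files to table names is sketched below, assuming each table is named after the file stem of an object whose extension matches the `STORED AS` format. That naming rule is an assumption, and `table_name_for` and `discover_tables` are hypothetical helpers, not functions from the PR or from DataFusion.

```rust
use std::path::Path;

/// Hypothetical helper: derive a table name from a listed object path,
/// keeping only files whose extension matches the catalog's format.
pub fn table_name_for(path: &str, format: &str) -> Option<String> {
    let p = Path::new(path);
    let ext = p.extension()?.to_str()?;
    if !ext.eq_ignore_ascii_case(format) {
        return None;
    }
    Some(p.file_stem()?.to_str()?.to_string())
}

/// Hypothetical helper: collect sorted, de-duplicated table names from a file listing.
pub fn discover_tables(paths: &[&str], format: &str) -> Vec<String> {
    let mut names: Vec<String> = paths
        .iter()
        .filter_map(|p| table_name_for(p, format))
        .collect();
    names.sort();
    names.dedup();
    names
}
```

With this rule, a listing containing `data/alltypes_plain.parquet` would yield a table named `alltypes_plain`, matching the table queried in the example output.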
   
   ## Are there any user-facing changes?
   
   Documentation only. I replaced the `sql_dialect.rs` example with 
`custom_sql_parser.rs` and updated the README. No API changes.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
