goldmedal commented on code in PR #97:
URL: https://github.com/apache/datafusion-site/pull/97#discussion_r2236086680


##########
content/blog/2025-07-27-extending-sql-with-satafusion.md:
##########
@@ -0,0 +1,248 @@
+---
+layout: post
+title: Implementing your own SQL dialect and SQL statements with DataFusion
+date: 2025-07-26
+author: Aditya Singh Rathore
+categories: [tutorial]
+---
+
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+Have you ever wished you could extend SQL with custom statements tailored to 
your specific use case? Maybe you're working with a parquet-files-on-S3 storage 
approach and need an `ATTACH DATABASE` statement, or perhaps you want to 
implement catalog management features similar to [DuckDB] or [SQLite]. With 
[Apache DataFusion], you can do exactly that – and it's more straightforward 
than you might think.
+
+[DuckDB]: https://duckdb.org/
+[SQLite]: https://www.sqlite.org/
+[Apache DataFusion]: https://datafusion.apache.org/
+
+## The Challenge: Beyond Standard SQL
+
+Imagine you're building a data platform that uses a parquet-files-on-S3 
storage pattern. Your application needs to dynamically discover and attach 
databases, similar to how [DuckDB] handles multiple databases with its `ATTACH` 
statement. While [DataFusion] supports `CREATE EXTERNAL TABLE`, you need 
something more flexible – perhaps a statement like:
+
+```sql
+CREATE EXTERNAL CATALOG my_catalog
+STORED AS PARQUET
+LOCATION 's3://my-bucket/data/'
+OPTIONS (
+  'aws.region' = 'us-west-2',
+  'catalog.type' = 'hive_metastore'
+);
+```
+
+Standard SQL doesn't have this capability, but DataFusion's extensible 
architecture makes it possible to add custom SQL statements like this.
+
+## Understanding the SQL Processing Pipeline
+
+Before diving into custom implementations, let's understand how DataFusion 
processes SQL queries. The journey from SQL text to execution follows this path:
+
+```text
++-------+      +--------+      +-----+      +-------------+      
+--------------+      +----------+
+| Query | ---> | Parser | ---> | AST | ---> |Logical Plan | ---> |Physical 
Plan | ---> |Execution |
++-------+      +--------+      +-----+      +-------------+      
+--------------+      +----------+
+
+```
+
+1. **SQL Text:** The raw SQL string you write
+2. **Parser:** DataFusion's `DFParser` converts SQL text into an Abstract 
Syntax Tree (AST)
+3. **AST:** A structured representation of the SQL statement using 
DataFusion's `Statement` enum

Review Comment:
   ```suggestion
   2. **AST:** DataFusion's `DFParser` converts SQL text into an Abstract 
Syntax Tree (AST), a structured representation of the SQL statement using 
DataFusion's `Statement` enum
   ```
   As I mentioned above, we can merge the point 2 and 3 to explain how 
DataFusion parse a SQL text to AST.



##########
content/blog/2025-07-27-extending-sql-with-satafusion.md:
##########
@@ -0,0 +1,248 @@
+---
+layout: post
+title: Implementing your own SQL dialect and SQL statements with DataFusion
+date: 2025-07-26
+author: Aditya Singh Rathore
+categories: [tutorial]
+---
+
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+Have you ever wished you could extend SQL with custom statements tailored to 
your specific use case? Maybe you're working with a parquet-files-on-S3 storage 
approach and need an `ATTACH DATABASE` statement, or perhaps you want to 
implement catalog management features similar to [DuckDB] or [SQLite]. With 
[Apache DataFusion], you can do exactly that – and it's more straightforward 
than you might think.
+
+[DuckDB]: https://duckdb.org/
+[SQLite]: https://www.sqlite.org/
+[Apache DataFusion]: https://datafusion.apache.org/
+
+## The Challenge: Beyond Standard SQL
+
+Imagine you're building a data platform that uses a parquet-files-on-S3 
storage pattern. Your application needs to dynamically discover and attach 
databases, similar to how [DuckDB] handles multiple databases with its `ATTACH` 
statement. While [DataFusion] supports `CREATE EXTERNAL TABLE`, you need 
something more flexible – perhaps a statement like:
+
+```sql
+CREATE EXTERNAL CATALOG my_catalog
+STORED AS PARQUET
+LOCATION 's3://my-bucket/data/'
+OPTIONS (
+  'aws.region' = 'us-west-2',
+  'catalog.type' = 'hive_metastore'
+);
+```
+
+Standard SQL doesn't have this capability, but DataFusion's extensible 
architecture makes it possible to add custom SQL statements like this.
+
+## Understanding the SQL Processing Pipeline
+
+Before diving into custom implementations, let's understand how DataFusion 
processes SQL queries. The journey from SQL text to execution follows this path:
+
+```text
++-------+      +--------+      +-----+      +-------------+      
+--------------+      +----------+
+| Query | ---> | Parser | ---> | AST | ---> |Logical Plan | ---> |Physical 
Plan | ---> |Execution |
++-------+      +--------+      +-----+      +-------------+      
+--------------+      +----------+
+

Review Comment:
   ```suggestion
   +----------+     +-----+      +--------------+      +---------------+      
+-----------+
   | SQL Text |---> | AST | ---> | Logical Plan | ---> | Physical Plan | ---> | 
Execution |
   +----------+     +-----+      +--------------+      +---------------+      
+-----------+
   ```
   I think this pipeline present each status of a SQL text change but the 
parser isn't a status. I prefer to remove `Parser` from them.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to