kmitchener commented on code in PR #3005:
URL: https://github.com/apache/arrow-datafusion/pull/3005#discussion_r936819921

##########
docs/source/user-guide/sql/ddl.md:
##########
@@ -30,9 +90,43 @@ STORED AS PARQUET
 LOCATION '/mnt/nyctaxi/tripdata.parquet';
 ```

-CSV data sources can also be registered by executing a `CREATE EXTERNAL TABLE` SQL statement. It is necessary to
-provide schema information for CSV files since DataFusion does not automatically infer the schema when using SQL
-to query CSV files.
+```sql
+CREATE EXTERNAL TABLE test
+STORED AS CSV
+WITH HEADER ROW
+LOCATION 'c:/tmp/test.csv';
+```
+
+Create an external table with partitioned CSV files
+
+```sql
+CREATE EXTERNAL TABLE p_test
+STORED AS CSV
+WITH HEADER ROW
+PARTITIONED BY (year)
+LOCATION 'c:/tmp/data';
+```
+
+The above statement looks for CSV files in the `c:/tmp/data` directory and creates a table with
+the columns and data types inferred, as well as adding a column for the partition:
+
+TODO: describe rules for inference. which files does it look at, how many rows? is it configurable?

Review Comment:
   Thanks, updated
