ozankabak commented on code in PR #9070:
URL: https://github.com/apache/arrow-datafusion/pull/9070#discussion_r1471226255

##########
docs/source/user-guide/sql/ddl.md:
##########
@@ -147,6 +147,16 @@ WITH HEADER ROW
 LOCATION '/path/to/directory/of/files';
 ```
 
+With `CREATE UNBOUNDED EXTERNAL TABLE` SQL statement. We can create unbounded data sources such as following:
+
+```sql
+CREATE UNBOUNDED EXTERNAL TABLE taxi
+STORED AS PARQUET
+LOCATION '/mnt/nyctaxi/tripdata.parquet';
+```
+
+Datafusion tries to execute queries that refer to unbounded sources in streaming fashion. If this is not possible according to query specifications, datafusion plan generation fails stating it is not possible to execute given query in streaming fashion. Please note that queries that can be executed with unbounded sources (e.g. in streaming mode) are a subset of the bounded sources. A query that fail with unbounded source may work in bounded source.

Review Comment:
```suggestion
Note that this statement actually reads data from a fixed-size file, so a better example would involve reading from a FIFO file. Nevertheless, once DataFusion sees the `UNBOUNDED` keyword in a data source, it tries to execute queries that refer to this unbounded source in a streaming fashion. If this is not possible according to the query specifications, plan generation fails, stating that it is not possible to execute the given query in a streaming fashion. Note that queries that can run with unbounded sources (i.e. in streaming mode) are a subset of those that can run with bounded sources. A query that fails with unbounded source(s) may work with bounded source(s).
```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
