adarshsanjeev commented on code in PR #15689: URL: https://github.com/apache/druid/pull/15689#discussion_r1475840275
##########
docs/multi-stage-query/reference.md:
##########
@@ -90,6 +93,66 @@ can precede the column list: `EXTEND (timestamp VARCHAR...)`.
 For more information, see [Read external data with EXTERN](concepts.md#read-external-data-with-extern).
 
+#### `EXTERN` to export to a destination
+
+`EXTERN` can be used to specify a destination, where the data needs to be exported.
+This variation of EXTERN requires one argument, the details of the destination as specified below.
+This variation additionally requires an `AS` clause to specify the format of the exported rows.
+
+INSERT statements and REPLACE statements are both supported with an `EXTERN` destination.
+Only `CSV` format is supported at the moment.
+Please note that partitioning (`PARTITIONED BY`) and clustering (`CLUSTERED BY`) is not currently supported with export statements.
+
+Export statements support the context parameter `rowsPerPage` for the number of rows in each exported file. The default value
+is 100,000.
+
+INSERT statements append the results to the existing files at the destination.
+```sql
+INSERT INTO
+  EXTERN(<destination function>)
+AS CSV
+SELECT
+  <column>
+FROM <table>
+```
+
+REPLACE statements have an additional OVERWRITE clause. As partitioning is not yet supported, only `OVERWRITE ALL`
+is allowed. REPLACE deletes any currently existing files at the specified directory, and creates new files with the results of the query.
+
+
+```sql
+REPLACE INTO
+  EXTERN(<destination function>)
+AS CSV
+OVERWRITE ALL
+SELECT
+  <column>
+FROM <table>
+```
+
+Exporting is currently supported for Amazon S3 storage. This can be done passing the function `S3()` as an argument to the `EXTERN` function. The `druid-s3-extensions` should be loaded.
+
+```sql
+INSERT INTO
+  EXTERN(S3(bucket=<...>, prefix=<...>, tempDir=<...>))

Review Comment:
   Changed the example here

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
