vogievetsky commented on code in PR #15689:
URL: https://github.com/apache/druid/pull/15689#discussion_r1475327849
##########
docs/multi-stage-query/reference.md:
##########
@@ -90,6 +93,66 @@ can precede the column list: `EXTEND (timestamp VARCHAR...)`.
 
 For more information, see [Read external data with EXTERN](concepts.md#read-external-data-with-extern).
 
+#### `EXTERN` to export to a destination
+
+`EXTERN` can be used to specify a destination where the data should be exported.
+This variation of EXTERN requires one argument: the details of the destination, as specified below.
+This variation additionally requires an `AS` clause to specify the format of the exported rows.
+
+INSERT statements and REPLACE statements are both supported with an `EXTERN` destination.
+Only `CSV` format is supported at the moment.
+Please note that partitioning (`PARTITIONED BY`) and clustering (`CLUSTERED BY`) are not currently supported with export statements.
+
+Export statements support the context parameter `rowsPerPage` for the number of rows in each exported file. The default value
+is 100,000.
+
+INSERT statements append the results to the existing files at the destination.
+```sql
+INSERT INTO
+  EXTERN(<destination function>)
+AS CSV
+SELECT
+  <column>
+FROM <table>
+```
+
+REPLACE statements have an additional OVERWRITE clause. As partitioning is not yet supported, only `OVERWRITE ALL`
+is allowed. REPLACE deletes any existing files at the specified directory and creates new files with the results of the query.
+
+```sql
+REPLACE INTO
+  EXTERN(<destination function>)
+AS CSV
+OVERWRITE ALL
+SELECT
+  <column>
+FROM <table>
+```
+
+Exporting is currently supported for Amazon S3 storage. This can be done by passing the function `S3()` as an argument to the `EXTERN` function. The `druid-s3-extensions` extension should be loaded.

Review Comment:
   I would like some more information on what this S3 function with the named parameters is. Is it some special case, or is it how we are settling on doing functions with named parameters? Is it a SQL thing, a Calcite thing, or a Druid thing?
   I have at one point seen functions with named parameters represented as
   ```
   FN(x="a")
   FN(x='a')
   FN(x=>'a')
   ```
   Where are all these variations coming from? Can there be quotes on the keys? Are they `"` or `'`? Can these functions also accept non-named (ordinal) parameters?

##########
docs/multi-stage-query/reference.md:
##########
@@ -90,6 +93,66 @@ can precede the column list: `EXTEND (timestamp VARCHAR...)`.
+Exporting is currently supported for Amazon S3 storage. This can be done by passing the function `S3()` as an argument to the `EXTERN` function.
+The `druid-s3-extensions` extension should be loaded.
+
+```sql
+INSERT INTO
+  EXTERN(S3(bucket=<...>, prefix=<...>, tempDir=<...>))

Review Comment:
   I think this example would be clearer if you used syntax that would actually parse, like: `EXTERN(S3(bucket='s3://your_bucket', prefix='prefix/to/files', tempDir='/var'))`. Otherwise it is very hard to understand what actually needs to go in there. I have read these docs and I still do not understand if the values are quoted with `'` or `"`.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
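For readers following along, the paging semantics described in the quoted docs (each exported file holds at most `rowsPerPage` rows, default 100,000; INSERT appends new files alongside any existing ones, while REPLACE with `OVERWRITE ALL` first deletes the files at the destination) can be sketched as a toy model. This is purely an illustration of the described behavior, not Druid's actual export implementation; `export_pages` is a hypothetical helper name.

```python
DEFAULT_ROWS_PER_PAGE = 100_000  # default from the quoted docs


def export_pages(rows, existing_files, mode="INSERT",
                 rows_per_page=DEFAULT_ROWS_PER_PAGE):
    """Toy model of MSQ export paging (not Druid's implementation).

    Splits `rows` into files of at most `rows_per_page` rows each.
    INSERT appends the new files to `existing_files`; REPLACE
    (OVERWRITE ALL) deletes everything at the destination first.
    """
    # REPLACE starts from an empty destination; INSERT keeps prior files.
    files = [] if mode == "REPLACE" else list(existing_files)
    # Chunk the result rows into pages of rows_per_page.
    for start in range(0, len(rows), rows_per_page):
        files.append(rows[start:start + rows_per_page])
    return files
```

For example, exporting 250 rows with `rows_per_page=100` yields three files (100, 100, and 50 rows); a subsequent INSERT adds files, while a REPLACE discards them and writes only the new results.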
