Re: [PR] add google as external storage for msq export (druid)

2024-04-05 Thread via GitHub


abhishekagarwal87 merged PR #16051:
URL: https://github.com/apache/druid/pull/16051


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



Re: [PR] add google as external storage for msq export (druid)

2024-04-05 Thread via GitHub


adarshsanjeev commented on PR #16051:
URL: https://github.com/apache/druid/pull/16051#issuecomment-2039043960

   Yes, the changes look good to me! Thanks for the PR!





Re: [PR] add google as external storage for msq export (druid)

2024-04-05 Thread via GitHub


abhishekagarwal87 commented on PR #16051:
URL: https://github.com/apache/druid/pull/16051#issuecomment-2039026288

   @adarshsanjeev - If it looks good to you, can we merge this?





Re: [PR] add google as external storage for msq export (druid)

2024-04-05 Thread via GitHub


abhishekagarwal87 commented on code in PR #16051:
URL: https://github.com/apache/druid/pull/16051#discussion_r1552981537


##
extensions-core/google-extensions/src/main/java/org/apache/druid/storage/google/output/GoogleExportConfig.java:
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.google.output;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.druid.java.util.common.HumanReadableBytes;
+
+import javax.annotation.Nullable;
+import java.util.List;
+
+public class GoogleExportConfig

Review Comment:
   Ah right. I see it now. 






Re: [PR] add google as external storage for msq export (druid)

2024-04-04 Thread via GitHub


pjain1 commented on PR #16051:
URL: https://github.com/apache/druid/pull/16051#issuecomment-2037614185

   @adarshsanjeev Yes, tested on a cluster.





Re: [PR] add google as external storage for msq export (druid)

2024-04-04 Thread via GitHub


pjain1 commented on code in PR #16051:
URL: https://github.com/apache/druid/pull/16051#discussion_r1552004545


##
extensions-core/google-extensions/src/main/java/org/apache/druid/storage/google/output/GoogleExportConfig.java:
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.google.output;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.druid.java.util.common.HumanReadableBytes;
+
+import javax.annotation.Nullable;
+import java.util.List;
+
+public class GoogleExportConfig

Review Comment:
   Just followed what `S3ExportStorageProvider` does. I think the reason for a 
separate config is that these configs need to be injected while creating the 
`ExportStorageProvider` instance in the SQL planning phase.






Re: [PR] add google as external storage for msq export (druid)

2024-04-04 Thread via GitHub


abhishekagarwal87 commented on code in PR #16051:
URL: https://github.com/apache/druid/pull/16051#discussion_r1550988691


##
extensions-core/google-extensions/src/main/java/org/apache/druid/storage/google/output/GoogleExportConfig.java:
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.google.output;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.druid.java.util.common.HumanReadableBytes;
+
+import javax.annotation.Nullable;
+import java.util.List;
+
+public class GoogleExportConfig

Review Comment:
   What is the need for this class, rather than passing these properties 
directly into GoogleExportStorageConnector? It would be nice to have uniformity 
across the various cloud provider implementations.






Re: [PR] add google as external storage for msq export (druid)

2024-03-28 Thread via GitHub


pjain1 commented on PR #16051:
URL: https://github.com/apache/druid/pull/16051#issuecomment-2025650814

   @adarshsanjeev Anything else?





Re: [PR] add google as external storage for msq export (druid)

2024-03-28 Thread via GitHub


pjain1 commented on PR #16051:
URL: https://github.com/apache/druid/pull/16051#issuecomment-2025647170

   Thanks @317brian, added your commits.





Re: [PR] add google as external storage for msq export (druid)

2024-03-13 Thread via GitHub


317brian commented on code in PR #16051:
URL: https://github.com/apache/druid/pull/16051#discussion_r1523793377


##
docs/multi-stage-query/reference.md:
##
@@ -149,6 +149,39 @@ The following runtime parameters must be configured to 
export into an S3 destina
 | `druid.export.storage.s3.maxRetry`   | No   | Defines the max 
number times to attempt S3 API calls to avoid failures due to transient errors. 

 | 10  |
 | `druid.export.storage.s3.chunkSize`  | No   | Defines the size 
of each chunk to temporarily store in `tempDir`. The chunk size must be between 
5 MiB and 5 GiB. A large chunk size reduces the API calls to S3, however it 
requires more disk space to store the temporary chunks. | 100MiB |
 
+
+# GS
+
+Export results to GCS by passing the function `google()` as an argument to the 
`EXTERN` function. Note that this requires the `druid-google-extensions`.
+The `google()` function is a Druid function that configures the connection. 
Arguments for `google()` should be passed as named parameters with the value in 
single quotes like the following example:
+
+```sql
+INSERT INTO
+  EXTERN(
+google(bucket => 'your_bucket', prefix => 'prefix/to/files')
+  )
+AS CSV
+SELECT
+  
+FROM 
+```
+
+Supported arguments for the function:
+
+| Parameter   | Required | Description 


   | Default |
+|-|--||-|
+| `bucket`| Yes  | The GS bucket to which the files are exported to. 
The bucket and prefix combination should be whitelisted in 
`druid.export.storage.google.allowedExportPaths`.   

  | n/a |
+| `prefix`| Yes  | Path where the exported files would be created. The 
export query expects the destination to be empty. If the location includes 
other files, then the query will fail. The bucket and prefix combination should 
be whitelisted in `druid.export.storage.google.allowedExportPaths`. | n/a |
+
+The following runtime parameters must be configured to export into an S3 
destination:

Review Comment:
   ```suggestion
   The following runtime parameters must be configured to export into a GCS 
destination:
   ```






Re: [PR] add google as external storage for msq export (druid)

2024-03-13 Thread via GitHub


317brian commented on PR #16051:
URL: https://github.com/apache/druid/pull/16051#issuecomment-1995447062

   Some minor copyedit nits. The only must-fix one is the incorrect cloud 
provider listed in an intro sentence.





Re: [PR] add google as external storage for msq export (druid)

2024-03-13 Thread via GitHub


317brian commented on code in PR #16051:
URL: https://github.com/apache/druid/pull/16051#discussion_r1523793377


##
docs/multi-stage-query/reference.md:
##
@@ -149,6 +149,39 @@ The following runtime parameters must be configured to 
export into an S3 destina
 | `druid.export.storage.s3.maxRetry`   | No   | Defines the max 
number times to attempt S3 API calls to avoid failures due to transient errors. 

 | 10  |
 | `druid.export.storage.s3.chunkSize`  | No   | Defines the size 
of each chunk to temporarily store in `tempDir`. The chunk size must be between 
5 MiB and 5 GiB. A large chunk size reduces the API calls to S3, however it 
requires more disk space to store the temporary chunks. | 100MiB |
 
+
+# GS
+
+Export results to GCS by passing the function `google()` as an argument to the 
`EXTERN` function. Note that this requires the `druid-google-extensions`.
+The `google()` function is a Druid function that configures the connection. 
Arguments for `google()` should be passed as named parameters with the value in 
single quotes like the following example:
+
+```sql
+INSERT INTO
+  EXTERN(
+google(bucket => 'your_bucket', prefix => 'prefix/to/files')
+  )
+AS CSV
+SELECT
+  
+FROM 
+```
+
+Supported arguments for the function:
+
+| Parameter   | Required | Description 


   | Default |
+|-|--||-|
+| `bucket`| Yes  | The GS bucket to which the files are exported to. 
The bucket and prefix combination should be whitelisted in 
`druid.export.storage.google.allowedExportPaths`.   

  | n/a |
+| `prefix`| Yes  | Path where the exported files would be created. The 
export query expects the destination to be empty. If the location includes 
other files, then the query will fail. The bucket and prefix combination should 
be whitelisted in `druid.export.storage.google.allowedExportPaths`. | n/a |
+
+The following runtime parameters must be configured to export into an S3 
destination:

Review Comment:
   ```suggestion
   The following runtime parameters must be configured to export into a GS 
destination:
   ```



##
docs/multi-stage-query/reference.md:
##
@@ -149,6 +149,39 @@ The following runtime parameters must be configured to 
export into an S3 destina
 | `druid.export.storage.s3.maxRetry`   | No   | Defines the max 
number times to attempt S3 API calls to avoid failures due to transient errors. 

 | 10  |
 | `druid.export.storage.s3.chunkSize`  | No   | Defines the size 
of each chunk to temporarily store in `tempDir`. The chunk size must be between 
5 MiB and 5 GiB. A large chunk size reduces the API calls to S3, however it 
requires more disk space to store the temporary chunks. | 100MiB |
 
+
+# GS
+
+Export results to GCS by passing the function `google()` as an argument to the 
`EXTERN` function. Note that this requires the `druid-google-extensions`.
+The `google()` function is a Druid function that configures the connection. 
Arguments for `google()` should be passed as named parameters with the value in 
single quotes like the following example:
+
+```sql
+INSERT INTO
+  EXTERN(
+google(bucket => 'your_bucket', prefix => 'prefix/to/files')
+  )
+AS CSV
+SELECT
+  
+FROM 
+```
+
+Supported arguments for the function:
+
+| Parameter   | Required | Description 


   | Default |
+|-|--||-|
+| `bucket`| Yes  | The GS bucket to which the files are exported to. 
The bucket and prefix combination 

Re: [PR] add google as external storage for msq export (druid)

2024-03-13 Thread via GitHub


pjain1 commented on code in PR #16051:
URL: https://github.com/apache/druid/pull/16051#discussion_r1523124198


##
extensions-core/google-extensions/src/main/java/org/apache/druid/storage/google/output/GoogleExportStorageProvider.java:
##
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.google.output;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.druid.data.input.impl.CloudObjectLocation;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.storage.ExportStorageProvider;
+import org.apache.druid.storage.StorageConnector;
+import org.apache.druid.storage.google.GoogleInputDataConfig;
+import org.apache.druid.storage.google.GoogleStorage;
+import org.apache.druid.storage.google.GoogleStorageDruidModule;
+
+import javax.validation.constraints.NotNull;
+import java.io.File;
+import java.net.URI;
+import java.util.List;
+
+@JsonTypeName(GoogleExportStorageProvider.TYPE_NAME)
+public class GoogleExportStorageProvider implements ExportStorageProvider
+{
+  public static final String TYPE_NAME = GoogleStorageDruidModule.SCHEME;

Review Comment:
   It was not public, so I made it public.






Re: [PR] add google as external storage for msq export (druid)

2024-03-13 Thread via GitHub


pjain1 commented on code in PR #16051:
URL: https://github.com/apache/druid/pull/16051#discussion_r1523115949


##
docs/multi-stage-query/reference.md:
##
@@ -149,6 +149,39 @@ The following runtime parameters must be configured to 
export into an S3 destina
 | `druid.export.storage.s3.maxRetry`   | No   | Defines the max 
number times to attempt S3 API calls to avoid failures due to transient errors. 

 | 10  |
 | `druid.export.storage.s3.chunkSize`  | No   | Defines the size 
of each chunk to temporarily store in `tempDir`. The chunk size must be between 
5 MiB and 5 GiB. A large chunk size reduces the API calls to S3, however it 
requires more disk space to store the temporary chunks. | 100MiB |
 
+
+# GS
+
+Export results to GCS by passing the function `google()` as an argument to the 
`EXTERN` function. Note that this requires the `druid-google-extensions`.
+The `google()` function is a Druid function that configures the connection. 
Arguments for `google()` should be passed as named parameters with the value in 
single quotes like the following example:
+
+```sql
+INSERT INTO
+  EXTERN(
+google(bucket => 'your_bucket', prefix => 'prefix/to/files')
+  )
+AS CSV
+SELECT
+  
+FROM 
+```
+
+Supported arguments for the function:
+
+| Parameter   | Required | Description 


   | Default |
+|-|--||-|
+| `bucket`| Yes  | The GS bucket to which the files are exported to. 
The bucket and prefix combination should be whitelisted in 
`druid.export.storage.google.allowedExportPaths`.   

  | n/a |
+| `prefix`| Yes  | Path where the exported files would be created. The 
export query expects the destination to be empty. If the location includes 
other files, then the query will fail. The bucket and prefix combination should 
be whitelisted in `druid.export.storage.google.allowedExportPaths`. | n/a |
+
+The following runtime parameters must be configured to export into an S3 
destination:
+
+| Runtime Parameter| Required | Description


  | Default |
+|--|--|--|-|
+| `druid.export.storage.google.tempLocalDir`   | Yes  | Directory used 
on the local storage of the worker to store temporary files required while 
uploading the data. 
   | n/a |
+| `druid.export.storage.google.allowedExportPaths` | Yes  | An array of GS 
prefixes that are whitelisted as export destinations. Export queries fail if 
the export destination does not match any of the configured prefixes. Example: 
`[\"gs://bucket1/export/\", \"gs://bucket2/export/\"]`| n/a |
+| `druid.export.storage.google.maxRetry`   | No   | Defines the 
max number times to attempt GS API calls to avoid failures due to transient 
errors. 
 | 10  |
+| `druid.export.storage.gooel.chunkSize`   | No   | Defines the 
size of each chunk to temporarily store in `tempDir`. The chunk size must be 
between 5 MiB and 5 GiB. A large chunk size reduces the API calls to GS, 
however it requires more disk space to store the temporary chunks. | 100MiB |

Review Comment:
   It's 4MiB, fixed the docs.






Re: [PR] add google as external storage for msq export (druid)

2024-03-13 Thread via GitHub


adarshsanjeev commented on code in PR #16051:
URL: https://github.com/apache/druid/pull/16051#discussion_r1522634321


##
docs/multi-stage-query/reference.md:
##
@@ -149,6 +149,39 @@ The following runtime parameters must be configured to 
export into an S3 destina
 | `druid.export.storage.s3.maxRetry`   | No   | Defines the max 
number times to attempt S3 API calls to avoid failures due to transient errors. 

 | 10  |
 | `druid.export.storage.s3.chunkSize`  | No   | Defines the size 
of each chunk to temporarily store in `tempDir`. The chunk size must be between 
5 MiB and 5 GiB. A large chunk size reduces the API calls to S3, however it 
requires more disk space to store the temporary chunks. | 100MiB |
 
+
+# GS
+
+Export results to GCS by passing the function `google()` as an argument to the 
`EXTERN` function. Note that this requires the `druid-google-extensions`.
+The `google()` function is a Druid function that configures the connection. 
Arguments for `google()` should be passed as named parameters with the value in 
single quotes like the following example:
+
+```sql
+INSERT INTO
+  EXTERN(
+google(bucket => 'your_bucket', prefix => 'prefix/to/files')
+  )
+AS CSV
+SELECT
+  
+FROM 
+```
+
+Supported arguments for the function:
+
+| Parameter   | Required | Description 


   | Default |
+|-|--||-|
+| `bucket`| Yes  | The GS bucket to which the files are exported to. 
The bucket and prefix combination should be whitelisted in 
`druid.export.storage.google.allowedExportPaths`.   

  | n/a |
+| `prefix`| Yes  | Path where the exported files would be created. The 
export query expects the destination to be empty. If the location includes 
other files, then the query will fail. The bucket and prefix combination should 
be whitelisted in `druid.export.storage.google.allowedExportPaths`. | n/a |
+
+The following runtime parameters must be configured to export into an S3 
destination:
+
+| Runtime Parameter| Required | Description


  | Default |
+|--|--|--|-|
+| `druid.export.storage.google.tempLocalDir`   | Yes  | Directory used 
on the local storage of the worker to store temporary files required while 
uploading the data. 
   | n/a |
+| `druid.export.storage.google.allowedExportPaths` | Yes  | An array of GS 
prefixes that are whitelisted as export destinations. Export queries fail if 
the export destination does not match any of the configured prefixes. Example: 
`[\"gs://bucket1/export/\", \"gs://bucket2/export/\"]`| n/a |
+| `druid.export.storage.google.maxRetry`   | No   | Defines the 
max number times to attempt GS API calls to avoid failures due to transient 
errors. 
 | 10  |
+| `druid.export.storage.gooel.chunkSize`   | No   | Defines the 
size of each chunk to temporarily store in `tempDir`. The chunk size must be 
between 5 MiB and 5 GiB. A large chunk size reduces the API calls to GS, 
however it requires more disk space to store the temporary chunks. | 100MiB |

Review Comment:
   The value of 100MiB as default has sometimes led to OOM issues. It should be 
lowered to 4MiB like 
org.apache.druid.storage.google.output.GoogleOutputConfig#DEFAULT_CHUNK_SIZE.



##
extensions-core/google-extensions/src/main/java/org/apache/druid/storage/google/output/GoogleExportStorageProvider.java:
##
@@ -0,0 +1,148 @@

Re: [PR] add google as external storage for msq export (druid)

2024-03-06 Thread via GitHub


adarshsanjeev commented on PR #16051:
URL: https://github.com/apache/druid/pull/16051#issuecomment-1980347387

   Thanks for the PR! I'll take a look at it shortly.

