AdarshKadameriTR opened a new issue, #7487:
URL: https://github.com/apache/hudi/issues/7487

   We have Hudi tables in S3 buckets; each bucket holds 40+ terabytes of data, and some hold 80+ TB. Recently our application has been failing with errors when reading from these S3 buckets. When we contacted AWS support, they informed us that we had reached quota limits on several occasions in the last 24 hours, and multiple times over the last 7 days. These buckets contain only Hudi tables. How are we reaching the quota limits, and which API quota should be increased?
   
   AWS Case ID 11532026531.
   
   **Environment Description**
   
   * EMR version : emr-6.7.0
   
   * Hudi version : 0.11.1
   
   * Spark version : Spark 3.2.1
   
   * Hive version : Hive 3.1.3
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : No
   
   **Stacktrace**
   
   ```
   2022-12-14T19:50:30.207+0000 [INFO] [1670994048838prod_correlation_id] [com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor] [GlobalS3Executor]: ReadTimeout File: xxxxxxxxxx/xxxxxxxxxx/5301b299-7abc-4230-8e23-ca7128074103-3_1133-1242-812796_20221214084918037.parquet; Range: [48449360, 48884521] Use default timeout configuration to retry for read timeout {}
   com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Read timed out
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1216)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1162)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(Amazon

   /xxxxxxxxxx/5301b299-7abc-4230-8e23-ca7128074103-3_1133-1242-812796_20221214084918037.parquet' for reading
   2022-12-14T19:50:26.920+0000 [INFO] [1670994048838prod_correlation_id] [com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor] [GlobalS3Executor]: ReadTimeout File: xxxxxxxxxx/xxxxxxxxxx/a7c86b71-a3f9-43e5-a3ae-ca9eefbd6f78-2_846-952-645550_20221214074208046.parquet; Range: [136089626, 136526893] Use default timeout configuration to retry for read timeout {}
   com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Read timed out
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1216)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1162)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
       at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$

   [1670994048838prod_correlation_id] [com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor] [GlobalS3Executor]: ReadTimeout File: xxxxxxxxxx/xxxxxxxxxx/a7c86b71-a3f9-43e5-a3ae-ca9eefbd6f78-2_846-952-645550_20221214074208046.parquet; Range: [136089626, 136526893] Use default timeout configuration to retry for read timeout {}
   ```
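   For context (not part of the original report): the log entries above are client-side read timeouts, which on large S3-backed Hudi tables often accompany S3 request-rate throttling against the per-prefix request limits. A hedged sketch of configuration knobs sometimes tuned in this situation — `fs.s3.maxRetries` and `fs.s3.maxConnections` are EMRFS properties documented for EMR, and `hoodie.metadata.enable` is a Hudi 0.11 option that reduces S3 LIST traffic; the values shown are illustrative assumptions, not recommendations:

   ```properties
   # EMRFS client tuning (emrfs-site) — values are illustrative assumptions
   fs.s3.maxRetries=30
   fs.s3.maxConnections=1000

   # Hudi: enable the metadata table so readers resolve file listings
   # from it instead of issuing large numbers of S3 LIST calls
   hoodie.metadata.enable=true
   ```

   On EMR the `fs.s3.*` properties would typically go into the `emrfs-site` configuration classification (or be passed as `spark.hadoop.`-prefixed Spark conf); whether they help depends on whether the bottleneck is throttling or network timeouts, which the AWS support case should clarify.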
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
