[GitHub] [hudi] AdarshKadameriTR commented on issue #7487: [SUPPORT] S3 Buckets reached quota limit when reading from hudi tables

GitBox Sun, 08 Jan 2023 23:20:34 -0800


AdarshKadameriTR commented on issue #7487:
URL: https://github.com/apache/hudi/issues/7487#issuecomment-1375198359

Hi @xushiyan,

We are incrementally upserting data into our Hudi tables every 5 minutes.
We have set **CLEANER_POLICY** to **KEEP_LATEST_BY_HOURS** with
**CLEANER_HOURS_RETAINED** = 48. The only operation we execute is **Upsert**.
We have a single writer, and compaction **runs every hour**.
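
For reference, the setup described above could be written out roughly as follows. This is only a sketch: the helper names `build_hudi_options` and `commits_in_retention_window` and the table name `my_table` are made up for illustration, and the option keys are the standard Hudi config names as I understand them, not copied from our job. It also works out how many commits fall inside the 48-hour retention window at one commit every 5 minutes.

```python
# Sketch only: helper names and the table name are hypothetical.
COMMIT_INTERVAL_MINUTES = 5  # one upsert every 5 minutes, as described above


def build_hudi_options(table_name: str) -> dict:
    """Writer options mirroring the cleaner settings described in this comment."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.cleaner.policy": "KEEP_LATEST_BY_HOURS",
        "hoodie.cleaner.hours.retained": "48",
    }


def commits_in_retention_window(hours_retained: int,
                                interval_minutes: int = COMMIT_INTERVAL_MINUTES) -> int:
    """Number of commits made during the cleaner's retention window."""
    return hours_retained * 60 // interval_minutes


opts = build_hudi_options("my_table")
retained = commits_in_retention_window(int(opts["hoodie.cleaner.hours.retained"]))
print(retained)  # 48 h * 60 min / 5 min = 576 commits
```

With these settings the cleaner keeps file versions for roughly 576 commits, so any file group that is updated often can carry many log files until compaction and cleaning catch up.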

> pls share more info like what the job is doing when this occurs - is it
> reading or writing?

Our application job only performs writes, using upserts as mentioned above.
According to AWS, they see S3 GET API calls at up to 700 per second. From the
logs we can see that Hudi is internally issuing these GET operations on the
log files in the table partitions. Most likely Hudi compaction is issuing
those reads.

> have you run clustering for this table?

We have **not enabled clustering** on the tables.

> what do the writer configs look like?

They are shown in the screenshots below.

![210503366-77d47c7c-169f-4a87-8234-0971079a9347](https://user-images.githubusercontent.com/110987545/211257318-ac7a3c01-3fd7-445e-8aee-b103d9cf06c1.png)

![210501558-28eb3712-fed8-4c93-9c85-ccb6ef3521dc](https://user-images.githubusercontent.com/110987545/211257330-c2ffd236-c08a-4169-a651-4cd4c2b62dbe.png)

**Partition structure**: s3://bucket/table/partition/parquet and .log files
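
To check how many log files are piling up next to the parquet base files in a partition, something like the sketch below could be run over a key listing. It assumes Hudi log file names contain `.log.` and base files end in `.parquet`; the function names and the sample keys are hypothetical, not taken from our bucket.

```python
from collections import Counter


def classify_hudi_file(key: str) -> str:
    """Classify an object key as a base file, a log file, or something else."""
    name = key.rsplit("/", 1)[-1]  # keep only the file name
    if name.endswith(".parquet"):
        return "base"
    if ".log." in name:  # assumption: log file names contain ".log."
        return "log"
    return "other"


def count_by_kind(keys) -> Counter:
    """Count base, log, and other files in a listing of object keys."""
    return Counter(classify_hudi_file(k) for k in keys)


# Hypothetical listing of one partition.
sample_keys = [
    "table/partition/abc_0-1-0_20230108120000.parquet",
    "table/partition/.abc_20230108120000.log.1_0-1-0",
    "table/partition/.abc_20230108120000.log.2_0-1-0",
    "table/partition/.hoodie_partition_metadata",
]
counts = count_by_kind(sample_keys)
print(counts["base"], counts["log"], counts["other"])
```

A high ratio of log files to base files in a partition would fit the picture of compaction having to read many log files.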

**Note**: We have an open issue about old log files not being cleaned by
the Hudi cleaner: **https://github.com/apache/hudi/issues/7600**

