[ 
https://issues.apache.org/jira/browse/IMPALA-13358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boglarka Egyed updated IMPALA-13358:
------------------------------------
    Priority: Minor  (was: Major)

> Add prefixes when writing files in S3 compatible object stores
> --------------------------------------------------------------
>
>                 Key: IMPALA-13358
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13358
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Manish Maheshwari
>            Priority: Minor
>              Labels: impala-iceberg
>
> AWS - 
> https://aws.amazon.com/blogs/big-data/best-practices-to-optimize-data-access-performance-from-amazon-emr-and-aws-glue-to-amazon-s3/
> {quote}
> Amazon S3 performance isn’t defined per bucket, but per prefix in a bucket. 
> Your applications can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 
> GET/HEAD requests per second per prefix in a bucket. Additionally, there are 
> no limits to the number of prefixes in a bucket, so you can horizontally 
> scale your read or write performance using parallelization. For example, if 
> you create 10 prefixes in an S3 bucket to parallelize reads, you could scale 
> your read performance to 55,000 read requests per second. You can similarly 
> scale writes by writing data across multiple prefixes.
> {quote}
> Impala should also support computing prefixes and using them when writing 
> data and delete files to object stores, so that subsequent scans are faster.
> Once this is supported, Impala can honour the Iceberg table property 
> `write.object-storage.enabled` for Iceberg tables.
> References:
> * [https://aws.amazon.com/blogs/big-data/best-practices-to-optimize-data-access-performance-from-amazon-emr-and-aws-glue-to-amazon-s3/]
> * [https://www.dremio.com/blog/ensuring-high-performance-at-any-scale-with-apache-icebergs-object-store-file-layout/]
>  
>  
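
The request above can be illustrated with a minimal sketch of prefix-based file placement. It is not Impala or Iceberg code. The helper names (hashed_prefix, object_store_path, max_read_rps), the bucket path, and the choice of SHA-256 are assumptions made for illustration only; the hash that a real implementation would use is not stated in this issue. The 5,500 requests per second figure is the per-prefix GET/HEAD number from the AWS text quoted above.

```python
import hashlib
import posixpath


def hashed_prefix(file_name: str, width: int = 8) -> str:
    """Hypothetical helper: derive a short hex prefix from the file name.

    SHA-256 is used only because it ships with Python; it is an assumption,
    not the hash used by Iceberg or Impala.
    """
    digest = hashlib.sha256(file_name.encode("utf-8")).hexdigest()
    return digest[:width]


def object_store_path(data_root: str, partition: str, file_name: str) -> str:
    """Hypothetical helper: put the hash right under the data root.

    Files from the same partition then land under different key prefixes
    instead of sharing one partition prefix.
    """
    return posixpath.join(data_root, hashed_prefix(file_name), partition, file_name)


def max_read_rps(num_prefixes: int, per_prefix_rps: int = 5500) -> int:
    """Aggregate read ceiling if requests spread evenly over the prefixes."""
    return num_prefixes * per_prefix_rps


if __name__ == "__main__":
    # The bucket and table path below are made up for the example.
    root = "s3a://example-bucket/warehouse/db/tbl/data"
    for name in ("file-0001.parquet", "file-0002.parquet"):
        print(object_store_path(root, "year=2024", name))
    print(max_read_rps(10))
```

Because the prefix depends only on the file name, a writer can compute it without any coordination, and a reader that already has the full path from table metadata does not need to recompute it. The trade-off is that files of one partition are no longer grouped under a single directory, so listing-based discovery of a partition stops working and the table metadata has to be the source of file locations.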



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
