adutra commented on issue #3621:
URL: https://github.com/apache/polaris/issues/3621#issuecomment-3826506144

   To be honest, I'm not 100% convinced of the usefulness of Polaris' object 
layout. 
   
   S3 is documented to support at least 3,500 PUT/COPY/POST/DELETE or 5,500 
GET/HEAD requests per second per "partitioned prefix":
   
   
https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html
   
   But we don't know exactly what a "partitioned prefix" is. For data file 
hotspots, Iceberg's layout is likely sufficient. For metadata file hotspots, 
Polaris's feature might provide additional value, since Iceberg's layout doesn't 
apply to these files, but metadata operations are typically lower volume than 
data writes.
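   
   To make the uncertainty concrete, here is a rough sketch of the best-case 
arithmetic. It assumes, purely for illustration, that S3 ends up partitioning 
on every distinct per-table prefix, which is exactly the part we cannot 
confirm. The `entropy_prefix` helper and its hashing scheme are hypothetical 
and are not taken from the Polaris code:

```python
import hashlib

# Per-prefix write rate quoted above (requests per second).
WRITE_RPS_PER_PREFIX = 3_500


def entropy_prefix(table_id: str, length: int = 4) -> str:
    """Hypothetical helper: derive a short hex prefix from a table identifier."""
    return hashlib.sha256(table_id.encode("utf-8")).hexdigest()[:length]


def best_case_write_rps(table_ids: list[str]) -> int:
    """Upper bound on aggregate writes per second, ASSUMING S3 partitions
    on every distinct prefix. Real partitioning behavior is not documented."""
    distinct_prefixes = {entropy_prefix(t) for t in table_ids}
    return len(distinct_prefixes) * WRITE_RPS_PER_PREFIX
```

   If that assumption does not hold, the prefixes add key-space diversity 
without any guaranteed throughput gain, which is why the questions below matter.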
   
   The following questions could help clarify the feature's value:
   
   1. Was this feature introduced based on empirical evidence of improved S3 
performance, or based 
      on theoretical assumptions about prefix distribution?
   2. Are there benchmarks or case studies showing that per-table entropy 
prefixes reduce hotspots
      or improve throughput?
   3. Does AWS documentation or support confirm that prefix diversity helps S3 
partition more 
      effectively?

