nastra commented on code in PR #7548: URL: https://github.com/apache/iceberg/pull/7548#discussion_r1187206151
##########
docs/aws.md:
##########
@@ -288,25 +288,25 @@ This design has the following benefits:

 ### RDS JDBC Catalog

-Iceberg also supports JDBC catalog which uses a table in a relational database to manage Iceberg tables.
-You can configure to use JDBC catalog with relational database services like [AWS RDS](https://aws.amazon.com/rds).
+Iceberg also supports the JDBC catalog which uses a table in a relational database to manage Iceberg tables.
+You can configure to use the JDBC catalog with relational database services like [AWS RDS](https://aws.amazon.com/rds).
 Read [the JDBC integration page](../jdbc/#jdbc-catalog) for guides and examples about using the JDBC catalog.
-Read [this AWS documentation](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.Connecting.Java.html) for more details about configuring JDBC catalog with IAM authentication.
+Read [this AWS documentation](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.Connecting.Java.html) for more details about configuring the JDBC catalog with IAM authentication.

 ### Which catalog to choose?

-With all the available options, we offer the following guidance when choosing the right catalog to use for your application:
+With all the available options, we offer the following guidlines when choosing the right catalog to use for your application:

Review Comment:
   typo in `guidlines`

##########
docs/aws.md:
##########
@@ -364,13 +364,13 @@ For more details, please read [S3 ACL Documentation](https://docs.aws.amazon.com
 ### Object Store File Layout

 S3 and many other cloud storage services [throttle requests based on object prefix](https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/).
-Data stored in S3 with a traditional Hive storage layout can face S3 request throttling as objects are stored under the same filepath prefix.
+Data stored in S3 with a traditional Hive storage layout can face S3 request throttling as objects are stored under the same file path prefix.

-Iceberg by default uses the Hive storage layout, but can be switched to use the `ObjectStoreLocationProvider`.
-With `ObjectStoreLocationProvider`, a determenistic hash is generated for each stored file, with the hash appended
+Iceberg by default uses the Hive storage layout but can be switched to use the `ObjectStoreLocationProvider`.
+With `ObjectStoreLocationProvider`, a deterministic hash is generated for each stored file, with the hash appended
 directly after the `write.data.path`. This ensures files written to s3 are equally distributed across multiple [prefixes](https://aws.amazon.com/premiumsupport/knowledge-center/s3-object-key-naming-pattern/) in the S3 bucket. Resulting in minimized throttling and maximized throughput for S3-related IO operations. When using `ObjectStoreLocationProvider` having a shared and short `write.data.path` across your Iceberg tables will improve performance.

-For more information on how S3 scales API QPS, checkout the 2018 re:Invent session on [Best Practices for Amazon S3 and Amazon S3 Glacier]( https://youtu.be/rHeTn9pHNKo?t=3219). At [53:39](https://youtu.be/rHeTn9pHNKo?t=3219) it covers how S3 scales/partitions & at [54:50](https://youtu.be/rHeTn9pHNKo?t=3290) it discusses the 30-60 minute wait time before new partitions are created.
+For more information on how S3 scales API QPS, check out the 2018 re: Invent session on [Best Practices for Amazon S3 and Amazon S3 Glacier]( https://youtu.be/rHeTn9pHNKo?t=3219). At [53:39](https://youtu.be/rHeTn9pHNKo?t=3219) it covers how S3 scales/partitions & at [54:50](https://youtu.be/rHeTn9pHNKo?t=3290) it discusses the 30-60 minute wait time before new partitions are created.

Review Comment:
   I believe `re:Invent` is actually correct here

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.