andygrove opened a new pull request, #3802:
URL: https://github.com/apache/datafusion-comet/pull/3802

   ## Which issue does this PR close?
   
   <!--Closes #.-->
   
   ## Rationale for this change
   
   When reading Parquet files from S3, each call to `initRecordBatchReader` 
creates a new `SessionContext`, `RuntimeEnv`, and S3 `ObjectStore` client. Each 
`ObjectStore` instance creates a new `reqwest` HTTP client with its own 
connection pool, requiring fresh DNS resolution for every file opened.
   
   In TPC-DS-scale workloads with thousands of Parquet files, this generates 
excessive DNS queries that can overwhelm DNS resolvers (e.g., Route53 Resolver 
limits in EKS environments), causing `UnknownHostException` errors, 
intermittent S3 connectivity failures, and job failures under high concurrency.
   
   Native Spark does not have this problem because Hadoop's `S3AFileSystem` 
maintains singleton instances per filesystem URI (via `FileSystem.get()` cache) 
and benefits from JVM-level DNS caching. Comet's native Rust layer bypasses all 
of this.
   
   Additionally, when no region is explicitly configured, each file read 
triggers a `HeadBucket` API call with a throwaway `reqwest::Client` to 
auto-detect the bucket region, doubling the DNS query volume.
   
   ## What changes are included in this PR?
   
   Two caches are introduced in the native Rust layer:
   
   1. **ObjectStore cache** (`parquet_support.rs`): A global cache keyed by 
`(URL prefix, config hash)` in `prepare_object_store_with_configs()`. 
Subsequent reads from the same bucket with the same configuration reuse the 
existing `ObjectStore` instance, enabling HTTP connection pooling via `reqwest` 
and eliminating redundant DNS lookups.
   
   2. **Region resolution cache** (`s3.rs`): A per-bucket cache in 
`resolve_bucket_region()` that ensures the `HeadBucket` API call (used to 
auto-detect bucket region when not explicitly configured) happens only once per 
bucket per JVM lifetime.
   
   Together these changes reduce DNS query volume from O(num_files) to 
O(num_distinct_buckets) per executor.
   
   **Workaround for users on current versions:** Set `fs.s3a.endpoint.region` 
explicitly to avoid the `HeadBucket` auto-detection entirely.
   
   ## How are these changes tested?
   
   - Existing unit tests in `parquet_support.rs` continue to pass (they test 
local filesystem paths which exercise the same 
`prepare_object_store_with_configs` code path).
   - Clippy passes with no warnings.
   - The caching is transparent — it returns the same `ObjectStore` instances 
that would have been created without caching, so existing S3 integration tests 
cover correctness.
   - Manual validation recommended in an EKS environment with S3-backed TPC-DS 
workloads to confirm DNS query volume reduction.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to