karuppayya commented on PR #4335:
URL: 
https://github.com/apache/datafusion-comet/pull/4335#issuecomment-4463758481

   @mbutrovich Appreciate the feedback — agree this is worth pinning down 
before going further.
   
    The main scenario is REST-catalog deployments where vended credentials and 
refresh config originate in the driver's catalog context.
   
     **1.** Iceberg already does exactly init+broadcast for `S3FileIO`. 
[`SparkBatch.planInputPartitions`](https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatch.java#L86-L92)
 broadcasts the configured FileIO and
     
[`S3FileIO.initialize(Map<String,String>)`](https://github.com/apache/iceberg/blob/main/aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java#L502)
 runs on the driver. This change follows the same pattern.
   
     **2.** I think REST makes this broadcast load-bearing( and not optional i 
think). 
[`LoadTableResponse.credentials`](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/responses/LoadTableResponse.java#L47)
 are obtained driver-side and reach executors because 
[`RESTSessionCatalog.newFileIO` calls
     
`setCredentials(...)`](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java#L1187-L1208)
 and the FileIO is then broadcast. Refresh on the executor uses 
[`VendedCredentialsProvider` built from broadcast
     
properties](https://github.com/apache/iceberg/blob/main/aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java#L459-L478).
   
     **3.** Multi-catalog. This change keys broadcasts by catalog name, so 
per-catalog isolation works. 
[#4309](https://github.com/apache/datafusion-comet/pull/4309)'s SPI has no 
catalog identifier, so two catalogs sharing a vendor class silently collide on 
one cached instance — fixable though.
   
     **4.** [#4309](https://github.com/apache/datafusion-comet/pull/4309) 
requires every vendor to ship Comet-Spark plumbing. A REST-catalog vendor would 
need to redo much of what this change does i guess
     
      Curious what you think.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to