dimas-b commented on code in PR #724:
URL: https://github.com/apache/polaris/pull/724#discussion_r1919160583


##########
service/common/src/main/java/org/apache/polaris/service/catalog/BasePolarisCatalog.java:
##########
@@ -1627,6 +1627,12 @@ private FileIO refreshIOWithCredentials(
     // the credentials should always override table-level properties, since
     // storage configuration will be found at whatever entity defines it
     tableProperties.putAll(credentialsMap);
+
+    // Populate the internal properties to FileIO in case the FileIO 
implementation needs them
+    Map<String, String> internalProperties =
+        
storageInfoEntity.map(PolarisEntity::getInternalPropertiesAsMap).orElse(Map.of());

Review Comment:
   TBH, I do not see much difference in passing a `Map` vs. 
`PolarisResolvedPathWrapper` to `FileIOFactory` because both approaches use 
essentially type-less containers of generic properties and the meaning of those 
properties is _implicit_ and subject to runtime / build variations.
   
   Would it be possible to have something like 
`FileIOFactory.loadFileIO(PolarisCatalogConfig config)` where 
`PolarisCatalogConfig` would have a `type` method returing `S3`, `GCS`, etc... 
plus typed "credentials", etc.?
   
   The idea being that `PolarisCatalogConfig` is strongly typed in Polaris core 
and persisted. `FileIOFactory` implementations use it to make Iceberg FileIO 
property bags in runtime. When the server is upgraded, the _same_ 
`PolarisCatalogConfig` is loaded, but factory implementations may be 
_different_ and initialize FileIO objects differently. However, `FileIOFactory` 
implementations still follow the contract of `PolarisCatalogConfig`. OSS 
implementations would use only the strongly typed part of 
`PolarisCatalogConfig` for operation.
   
   If a particular `FileIOFactory` implementation needs some config that is not 
yet defined in Polaris core, and does not "fit" in OSS, the config would use 
the `type` of `MY-CUSTOM-S3` (as an example) and provide for extra generic 
properties. Ideally, the parsing of config properties (on loading from 
persistence) is also delegated to a pluggable ckass that knows how to deal with 
`MY-CUSTOM-S3`. WDYT?
   
   If that is too much, for this PR, but you agree in principle, how about 
removing `String impl` from `FileIOFactory.loadFileIO` and adding a well-know 
type property plus javadoc that the factory is responsible for interpreting the 
`type`. Then, we could pass both "normal" and "internal" properties to 
`FileIOFactory.loadFileIO` as a short-term solution.
   
   ```
   public interface FileIOFactory {
     // type is specifically NOT the Iceberg FileIO class name
     FileIO loadFileIO(String type, Map<String, String> properties, Map<String, 
String> internalProperties);
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to