MonkeyCanCode commented on code in PR #3494:
URL: https://github.com/apache/polaris/pull/3494#discussion_r2724495944


##########
runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java:
##########
@@ -1409,6 +1409,22 @@ public void doRefresh() {
                       Set.of(PolarisStorageActions.READ, PolarisStorageActions.LIST));
               return TableMetadataParser.read(fileIO, metadataLocation);
             });
+
+        // After a refresh, re-load the FileIO with the new table metadata properties to
+        // ensure the right permissions are present for subsequent file system interactions.
+        if (currentMetadata != null) {
+          tableFileIO =
+              loadFileIOForTableLike(
+                  tableIdentifier,
+                  StorageUtil.getLocationsUsedByTable(currentMetadata),
+                  resolvedEntities,
+                  new HashMap<>(currentMetadata.properties()),
+                  Set.of(
+                      PolarisStorageActions.READ,
+                      PolarisStorageActions.WRITE,
+                      PolarisStorageActions.LIST));

Review Comment:
   So I tried to implement the fix, but it ends up being a major refactor, and we may need a delegate FileIO class.
   
   Based on my understanding, the issue is specific to that setup (I also tried to reproduce a similar setup locally for validation, but no luck so far, since the setup details have not been shared and there is no minimal reproducible example): after Spark uses its read-only credential, which has assume-role privileges, to assume the role associated with the catalog, it cannot use the STS token for the insert and therefore falls back to the client id/secret associated with the Spark client. In this case, that can only happen after the metadata refresh (which, as I understand it, must be called before the insert can happen). Spark then uses the FileIO returned by `io()`, which can have lost write access (hence the initial fix worked); the sketch below illustrates the call order I mean.
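   
   A minimal sketch of that call order (not Polaris code; `WritePathSketch` and `prepareDataFile` are made-up names for illustration, the Iceberg `Table`/`FileIO` APIs are real):
   
   ```java
   import org.apache.iceberg.Table;
   import org.apache.iceberg.io.FileIO;
   import org.apache.iceberg.io.OutputFile;
   
   public class WritePathSketch {
     // Writers obtain their FileIO via io() after the refresh, so whatever
     // credentials doRefresh() baked into that FileIO decide whether the
     // data file write below is allowed.
     static OutputFile prepareDataFile(Table table, String path) {
       table.refresh();               // triggers doRefresh(), which (re)loads the FileIO
       FileIO io = table.io();        // the FileIO instance the refresh left behind
       return io.newOutputFile(path); // fails if that FileIO was vended read-only
     }
   }
   ```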
   
   However, in the setup I have on AWS, the client role never has write access to the bucket directly; it uses assume-role with the IAM role associated with the catalog for all operations. So I believe this issue may be very setup specific.
   
   I am a bit hesitant to do the major refactor for a delegate FileIO (so that permissions can be determined from context, e.g. read-only for `io().newInputFile()` and read-write for `io().newOutputFile()`), since I don't have a setup I can use to validate it, and future refactors could silently break it without proper test cases/setups in place. A rough sketch of the idea follows.
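   
   Something along these lines (a sketch only, under the assumption that each scope maps to a separately vended FileIO; `PermissionScopedFileIO` and the `Supplier`-based wiring are hypothetical, not existing Polaris code):
   
   ```java
   import java.util.function.Supplier;
   
   import org.apache.iceberg.io.FileIO;
   import org.apache.iceberg.io.InputFile;
   import org.apache.iceberg.io.OutputFile;
   
   /**
    * Hypothetical delegate: routes reads to a FileIO vended with READ/LIST
    * credentials and writes/deletes to one vended with READ/WRITE/LIST
    * credentials, so write-scoped credentials are only requested when a
    * write actually happens.
    */
   public class PermissionScopedFileIO implements FileIO {
     // Suppliers so each scoped FileIO is resolved lazily; callers would be
     // expected to memoize them (e.g. wrapping loadFileIOForTableLike calls).
     private final Supplier<FileIO> readIo;
     private final Supplier<FileIO> writeIo;
   
     public PermissionScopedFileIO(Supplier<FileIO> readIo, Supplier<FileIO> writeIo) {
       this.readIo = readIo;
       this.writeIo = writeIo;
     }
   
     @Override
     public InputFile newInputFile(String path) {
       return readIo.get().newInputFile(path); // read-only scope is enough here
     }
   
     @Override
     public OutputFile newOutputFile(String path) {
       return writeIo.get().newOutputFile(path); // needs the write-scoped FileIO
     }
   
     @Override
     public void deleteFile(String path) {
       writeIo.get().deleteFile(path); // deletes also need write access
     }
   }
   ```
   
   Even with something like this, I would still want a real STS-based test setup before merging, so future refactors can't silently break it.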
   
   Maybe we should have someone more familiar with this code path take another look in case I missed something here. cc @dimas-b @evindj @flyrain


