dennishuo commented on code in PR #2422:
URL: https://github.com/apache/polaris/pull/2422#discussion_r2301700712


##########
runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java:
##########
@@ -508,6 +508,11 @@ private void createNamespaceInternal(
     } else {
       LOGGER.debug("Skipping location overlap validation for namespace '{}'", 
namespace);
     }
+    if (!realmConfig.getConfig(
+        FeatureConfiguration.ALLOW_NAMESPACE_CUSTOM_LOCATION, catalogEntity)) {

Review Comment:
   The original design was always to still securely support a situation where 
people might want to remap all their namespaces/tables into UUID paths, as long 
as they reside under the ALLOWED_LOCATIONS defined within the StorageConfig.
   
   In this PR we need to be very precise about what we mean by "custom 
location". As I understand it, "custom" just means it doesn't follow the 
"default" where the pathname is equal to the namespace name hierarchically:
   
   `ns1.ns2` -> `s3://basebucket/basepath1/ns1/ns2/`
   `StorageConfig.defaultBaseLocation = s3://basebucket/basepath1`
   `StorageConfig.allowedLocations= [s3://basebucket/basepath1, 
s3://basebucket/basepath2]`
   
   If we allow a "custom location", a user might want:
   `ns1.ns2` -> `s3://basebucket/basepath1/6f8af834t/`
   `StorageConfig.defaultBaseLocation = s3://basebucket/basepath1`
   `StorageConfig.allowedLocations= [s3://basebucket/basepath1, 
s3://basebucket/basepath2]`
   
   In such a case we must still validate that the "custom" location  
`s3://basebucket/basepath1/6f8af834t/` resides under *one of* the allowed 
locations in the StorageConfig, and by default we'd still also perform a check 
to prevent overlaps.
   
   So there's three completely distinct concepts:
   
   1. "Custom paths" -> Use supplied paths that might be uuids, rather than 
"default" name-friendly paths. Disabling this is fairly commonplace, and is 
more a *stylistic* choice by the user
   2. "Overlapping paths check" -> Whether paths are custom or follow default 
name-friendly paths, we perform overlap checks so that RBAC can "correctly" 
uniquely identify a storage prefix during credential-vending. In theory, if 
*all* paths only use default name-based paths, we don't need overlap checks, 
but since "custom paths" allows someone to already remap some other namespace 
to the path before creating a new namespace using "default name-based paths", 
in general the overlap check may still be needed. Catalog admins may disable 
this for performance if they have full control over path specification - this 
is a *declarative* security option, where a catalog admin declares whether 
paths are fully under trusted control
   3. "Allowed locations check" -> No matter what mix of custom/default 
locations is used for all namespaces and tables, the primary invariant is that 
all credential vending *must* only be under one or more of the declared 
ALLOWED_LOCATIONS here. Credential vending is used *internally* as well for 
Polaris to access metadata json, so this also applies for CreateTable even if 
the caller isn't asking for vended-credentials externally. This option should 
*not* generally be configurable for disablement. If a catalog admin wishes for 
a new path to be allowed in a catalog, they should update the StorageConfig to 
add that ALLOWED_LOCATION.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@polaris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to