I do like the simplicity. Unfortunately, I don’t think it meets the “secure by default” test. An admin may enable unstructured locations without understanding the credential vending implications. Now, in your scenario, if the admin enabled unstructured table locations and that implicitly turned off credential vending until the admin explicitly enabled credential vending, I think we’d have a good, safe default. TBH, that might have been the way to go when we added the unstructured locations without understanding flag in the first place.
I think there are multiple considerations to balance: 1. I think things should work by default - when the user creates the catalog (INTERNAL or EXTERNAL), tables and namespaces should just work right away. 2. Things should be secure by default - that is, admins should not have to worry about whether the credentials they vend grant more privileges than they ought to. 3. Maintain as few knobs as possible (I think this is where you’re coming from). IMO, forcing admins to turn on unstructured table locations for EXTERNAL catalogs breaks the first principle just because so many existing catalogs already use unstructured table locations. Furthermore, making this change would break users who are already using Polaris to query their external catalogs (although, that kinda is the topic of this thread). Asking them to then DISable credential vending violates the second principle because the default value is less secure than the other option. In my mind we should have two flags: one disable credential vending (enabled by default) for all catalogs and one to enable it for external catalogs (disabled by default). I think that meets the first two requirements, without straining the third. Mike On Tue, Oct 22, 2024 at 8:15 PM Eric Maynard <eric.w.mayn...@gmail.com> wrote: > I see. I am imagining a scenario like the following: > > I'm a Polaris admin who wants to add a new external catalog that points to > an existing Glue catalog. Since I know the tables in that catalog occupy > all sorts of wacky locations, I specify *ALLOW_UNSTRUCTURED_TABLE_LOCATION* > as true when I create the catalog. In order to protect against one table's > credentials being used to read another table, I further specify > *ALLOW_CREDENTIAL_VENDING* as false. > > Later, I want to create an internal catalog. For whatever reason, I > anticipate that catalog will have tables in wacky locations, too. I set > the same two flags mentioned above the same way. I get the same behavior. > > Does the above workflow make sense to you? I like the simplicity of this > approach. > > On Tue, Oct 22, 2024 at 5:08 PM Michael Collado <collado.m...@gmail.com> > wrote: > > > Admins don’t have the knob because they don’t control the structure of > the > > remote catalog. This could be something like Glue or Tabular that > generate > > UUIDs for their table locations. Or it can be entirely user controller > > where tables are randomly scattered across the bucket. E.g., maybe the > user > > wanted to use object store layout and write to random locations. > > > > Admins may want to expose these catalogs to provide a single access point > > for their users and to provide some consistency in their RBAC, but may > not > > want to rely on vended credentials since they cannot guarantee the > > application of those credentials. Admins should retain the ability to > > support vended credentials for cases where there is no overlap and should > > also be able to expose catalogs where we cannot prevent overlap. Those > > catalogs should be controlled differently. > > > > Mike > > > > On Tue, Oct 22, 2024 at 4:53 PM Eric Maynard <eric.w.mayn...@gmail.com> > > wrote: > > > > > > They do not have such a knob for EXTERNAL catalogs > > > > > > This by itself seems like a problem to me. > > > > > > It seems like > > > < > > > > > > https://github.com/apache/polaris/blob/02468e0bed230641893c2bd45cde790dfd2ca76c/polaris-service/src/main/java/org/apache/polaris/service/admin/PolarisAdminService.java#L540 > > > > > > > we > > > do check ALLOW_OVERLAPPING_CATALOG_URLS for both internal and external > > > catalogs. The check for ALLOW_UNSTRUCTURED_TABLE_LOCATION is skipped > here > > > < > > > > > > https://github.com/apache/polaris/blob/02468e0bed230641893c2bd45cde790dfd2ca76c/polaris-core/src/main/java/org/apache/polaris/core/storage/PolarisStorageConfigurationInfo.java#L173 > > > > > > > for > > > external catalogs. Why is that? > > > > > > If we respected ALLOW_UNSTRUCTURED_TABLE_LOCATION for external > catalogs, > > it > > > seems like admins would have that knob you mentioned. Since #264 > > > <https://github.com/apache/polaris/pull/264> we allow overrides of > this > > on > > > the catalog level, so admins that wanted to allow unsecure overlapping > > > table locations in external catalogs they could do so. > > > > > > On Tue, Oct 22, 2024 at 4:38 PM Michael Collado < > collado.m...@gmail.com> > > > wrote: > > > > > > > I responded to your comment on the PR already. I think a flag to > > disable > > > > credential vending for INTERNAL catalogs does make sense, but I think > > it > > > > should be a separate flag. > > > > > > > > Service admins already have the knobs to make an INTERNAL catalog > > secure > > > by > > > > default by enforcing non-overlapping table locations. They have to go > > out > > > > of their way to allow table overlap. > > > > > > > > They do not have such a knob for EXTERNAL catalogs. A single flag for > > > > disabling all credential vending means that an admin cannot securely > > > > configure INTERNAL catalogs and support credential vending without > also > > > > allowing for potentially insecure EXTERNAL catalogs. The controls for > > > each > > > > type of catalog are different, so they should be configured > > differently. > > > > > > > > Again, I think such a flag is a good idea, but should be a separate > > flag > > > > and belongs in another PR. > > > > > > > > Mike > > > > > > > > On Tue, Oct 22, 2024 at 4:25 PM Eric Maynard < > eric.w.mayn...@gmail.com > > > > > > > wrote: > > > > > > > > > I commented the same on the PR, but it’s not obvious to me why we > > would > > > > > make this exclusive to external catalogs. You can create internal > > > > catalogs > > > > > with overlapping locations too. > > > > > > > > > > A single flag to disable credential vending altogether solves this > > > > problem > > > > > across the board and also allows for finer-grained control if these > > > > > overrides are ever respected for namespaces or tables. > > > > > > > > > > Other flags like the ones to allow location overlap or to disable > > > > > subscoping are not specific to a catalog type. > > > > > > > > > > On Tue, Oct 22, 2024 at 3:15 PM Michael Collado < > > > collado.m...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hey folks > > > > > > > > > > > > I opened a PR at https://github.com/apache/polaris/pull/395 to > > > support > > > > > > disabling credential vending for external catalogs. > > > > > > > > > > > > Currently, some remote catalogs don't control the table location > > for > > > > > > Iceberg tables, allowing people to create tables in overlapping > > > > > directories > > > > > > in storage. A person that has storage credentials to one table > > > > > necessarily > > > > > > has access to all of the tables that overlap. This is already an > > > issue > > > > > for > > > > > > Iceberg users. > > > > > > > > > > > > However, with Polaris, some users expect that table-level RBAC > and > > > > > > credential vending will somehow magically resolve their security > > > issues > > > > > so > > > > > > that they can tightly control access despite overlapping table > > > > locations. > > > > > > This is obviously a misconception. > > > > > > > > > > > > But in the spirit of making things secure by default, I am > adding a > > > > > > configuration flag to disable credential vending for external > > > catalogs. > > > > > As > > > > > > with the current configuration to disable overlapping table > > > locations, > > > > > > there is a catalog-level override so that catalog admins can > enable > > > > > > credential vending on a case-by-case basis if they feel the risk > is > > > > > > tolerable. > > > > > > > > > > > > Changing the configuration to disable credential vending by > default > > > is > > > > a > > > > > > breaking change, so I'd like to see what everyone thinks about > > making > > > > > that > > > > > > change. Currently, the PR does _not_ change the default value, so > > we > > > > can > > > > > > merge it as is, without any impact on existing users. But unless > > > users > > > > > > really dig into the code or the documentation, they may not > > > understand > > > > > the > > > > > > implications well enough to make the more secure configuration > > > setting. > > > > > > > > > > > > WDYT? > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > >