[
https://issues.apache.org/jira/browse/HIVE-27714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HIVE-27714:
----------------------------------
Labels: check pull-request-available (was: check)
> Iceberg: metadata location overrides can cause data breach - handling default
> locations
> ----------------------------------------------------------------------------------------
>
> Key: HIVE-27714
> URL: https://issues.apache.org/jira/browse/HIVE-27714
> Project: Hive
> Issue Type: Sub-task
> Components: Authorization, Iceberg integration
> Affects Versions: 4.0.0-alpha-2
> Reporter: Janos Kovacs
> Assignee: Ayush Saxena
> Priority: Critical
> Labels: check, pull-request-available
>
> With the current Iceberg location authorization, one explicit Ranger policy is
> required for every table to prevent the metadata_location cross-reference
> exploit, because any wildcard-based policy covering a set of tables would allow
> cross-referencing locations between the tables covered by the wildcard.
> Maintaining one policy per table is nearly impossible in a production environment.
> The proposal is to handle the Iceberg table RWStorage authorization
> differently when the table is created/altered with its default location, as
> in this case there is no attempt to cross-reference another table. There
> are two options for this:
> When?
> * If no custom metadata_location is set/given in the CREATE/ALTER calls
> * If the given metadata_location's path (i.e. without the metadata json file
> name) is the same as the current metadata_location's path in the ALTER calls
> * If the given metadata_location's path set/given in CREATE/ALTER calls is
> the same as the default location would be for the table based on the
> warehouse and/or database locations
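
The three "When?" conditions above can be sketched as a single check. This is only an illustrative Python sketch, not Hive code: the function names and the `default_metadata_dir` parameter are hypothetical, and how the default directory is derived from the warehouse and/or database locations is left to the caller.

```python
from typing import Optional


def parent_dir(metadata_location: str) -> str:
    """Return the location without the trailing metadata json file name."""
    return metadata_location.rstrip("/").rsplit("/", 1)[0]


def is_default_location(
    requested: Optional[str],      # metadata_location given in CREATE/ALTER, if any
    current: Optional[str],        # current metadata_location (ALTER only)
    default_metadata_dir: str,     # hypothetical: default dir computed by the caller
) -> bool:
    # Condition 1: no custom metadata_location was set/given.
    if requested is None:
        return True
    requested_dir = parent_dir(requested)
    # Condition 2: ALTER keeps the metadata file in the current directory.
    if current is not None and requested_dir == parent_dir(current):
        return True
    # Condition 3: the given path equals the table's default location.
    return requested_dir == default_metadata_dir.rstrip("/")
```

A request that points into another table's metadata directory fails all three conditions and would still go through the normal custom-location authorization.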
> What?
> # Either do not call the RWStorage Authorizer for this case
> # Or set the location to a constant value that can be easily handled with
> one single access policy on the Authorizer side
> Pros/Cons:
> * Option-1 would not call the Authorizer, so it would not generate an audit
> event for these cases on RWStorage level policies, but it would omit the
> Authorization step so it would be more performant
> * Option-2 would end up in the Authorizer, which means it would also generate
> an audit event. It also needs a pre-agreed constant for such cases that can be
> differentiated from normal custom location based authorizations.
> If Option-2 is chosen:
> * The following policy syntax could be used for custom locations:
> {noformat}
> iceberg://mydatabase/mytable/snapshot=/my/custom/location/whatever/*
> {noformat}
> * While the pre-agreed default location constant based policy format could
> be:
> {noformat}
> iceberg://*/*/snapshot=default_location
> {noformat}
>
> A new property could even be introduced to decide whether the Authorization
> for default locations should be skipped altogether or not (in which case e.g.
> the snapshot=default_location constant would be used). This way everyone can
> decide whether audit events or the performance gained without the
> authorization step is preferred.
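
As a rough illustration of Option-2 together with the proposed property, the sketch below builds the resource string that would be handed to the Authorizer and checks it against the two policy formats quoted above. It is a Python sketch under stated assumptions, not Hive or Ranger code: `storage_authz_resource`, `skip_default_authz` and the use of `fnmatch` as a stand-in for the Authorizer's wildcard matching are all hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Pre-agreed constant from the proposal (the exact value is still open).
DEFAULT_LOCATION_CONSTANT = "default_location"


def storage_authz_resource(
    database: str,
    table: str,
    metadata_dir: Optional[str],
    is_default: bool,
    skip_default_authz: bool,  # hypothetical property from the proposal
) -> Optional[str]:
    """Return the resource to authorize, or None if authorization is skipped."""
    if is_default:
        if skip_default_authz:
            return None  # Option-1: no Authorizer call, hence no audit event
        # Option-2: constant that one wildcard policy can cover
        return f"iceberg://{database}/{table}/snapshot={DEFAULT_LOCATION_CONSTANT}"
    # Custom location: authorize against the real path
    return f"iceberg://{database}/{table}/snapshot={metadata_dir}"


# fnmatch only approximates wildcard policy matching for this illustration.
default_policy = "iceberg://*/*/snapshot=default_location"
custom_policy = "iceberg://mydatabase/mytable/snapshot=/my/custom/location/whatever/*"

default_res = storage_authz_resource("mydatabase", "mytable", None, True, False)
custom_res = storage_authz_resource(
    "mydatabase", "mytable", "/my/custom/location/whatever/metadata", False, False
)
print(fnmatchcase(default_res, default_policy))  # True
print(fnmatchcase(custom_res, default_policy))   # False
print(fnmatchcase(custom_res, custom_policy))    # True
```

With this shape, a single wildcard policy on the constant covers every default-location table, while custom locations still need a policy on the actual path.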
--
This message was sent by Atlassian Jira
(v8.20.10#820010)