Thank you, Wen, you made my day. I think this is exactly what I was talking about. I might just need to implement a proper credential provider for my use case.
One curiosity, though: I see that you check for read/write permissions only. You never test whether the credentials are able to delete objects, which matters for drop privileges. Why is that?

Thanks,
Marco

On Tue, Jan 17, 2023, 9:49 PM Wen Shi via user <user@hive.apache.org> wrote:

> Hi Marco,
>
> You can check this out:
> https://github.com/awslabs/amazon-emr-user-role-mapper/tree/master/emr-user-role-mapper-s3storagebasedauthorizationmanager
>
> It is open-sourced with the AWS EMR utils, named URM, and we have been using it for two years now.
>
> Thanks,
> Wen
>
> On Tue, Jan 17, 2023 at 1:12 AM Marco Jacopo Ferrarotti <marco.ferraro...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm building an on-prem data warehouse with a custom S3 gateway as the storage backend. I was able to deploy a standalone Hive Metastore Server (HMS) secured by Kerberos; however, I'm now having a hard time figuring out how to manage authorization.
>>
>> It seems to me that the storage-based authorization layer is not compatible with s3a, since Hadoop reports only stub permissions for such a "fs". On the other hand, SQL Standards Based Authorization would force everyone to access the data through HiveServer2, which is not viable for my use case. At a minimum, I would like two ways to access the data/metadata:
>>
>> 1. using PySpark (mainly to develop ETL/ELT pipelines);
>> 2. using a JDBC/ODBC connector (mainly to feed BI dashboards); for this I was considering the Spark Thrift Server, but I'm open to HiveServer2 as well.
>>
>> Am I missing something? Right now the only option I see is to write a custom MetastoreAuthorizationProvider that checks s3a permissions, either by querying the bucket ACLs or by performing test read/write/delete actions on the bucket. Has anyone tried to implement something similar?
>>
>> Thanks,
>> Marco
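A minimal sketch of the probe-style check discussed in this thread, for illustration: it attempts a write, a read, and a delete of a marker object through Hadoop's generic FileSystem API over s3a, with the delete probe covering the drop-privilege case raised above. The class name S3aAccessProbe and the marker key ".hms-authz-probe" are hypothetical, not part of any Hive or Hadoop API; a real MetastoreAuthorizationProvider would wire a check like this into its authorize() methods.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical probe class: not a Hive API, just a sketch of the
// "test read/write/delete actions on the bucket" idea from the thread.
public final class S3aAccessProbe {

    // Returns true only if the current credentials can write, read and
    // delete under the given s3a:// location.
    public static boolean canWriteReadDelete(URI location, Configuration conf) {
        // Marker key is illustrative; a real check should scope it per user/session.
        Path marker = new Path(new Path(location), ".hms-authz-probe");
        try {
            FileSystem fs = FileSystem.get(location, conf);
            try (FSDataOutputStream out = fs.create(marker, true)) {
                out.writeBytes("probe");         // write check
            }
            fs.open(marker).close();             // read check
            return fs.delete(marker, false);     // delete check (drop privileges)
        } catch (Exception e) {
            return false;                        // any failure => deny
        }
    }
}

Note that a probe like this is racy and can leave a stray marker object behind if it fails between the create and the delete; querying the bucket ACLs instead, as also suggested above, avoids mutating the bucket at all.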