guilherme0170 opened a new issue #9844:
URL: https://github.com/apache/incubator-superset/issues/9844


   I'm setting up Superset (0.36.0) in production mode (with Gunicorn), and I 
would like to set up impersonation for Impala queries on my Kerberized 
cluster, so that each Superset user has the same privileges on tables/databases 
that he has in Hive/Hue/HDFS. I've set "Impersonate the logged on user" to 
true in my database config, but it does not change the user that runs the 
query; it always uses the celery-worker user.
   
   My database config is:
   
   [![Top Database Config][1]][1]
   [![End of Database Config][2]][2]
   
   ```
   Extra:
   
       {
           "metadata_params": {},
           "engine_params": {
               "connect_args": {
                   "port": 21050,
                   "use_ssl": "True",
                   "ca_cert": "/path/to/my/cert.pem",
                   "auth_mechanism": "GSSAPI"
               }
           },
           "metadata_cache_timeout": {},
           "schemas_allowed_for_csv_upload": []
       }
   ```
   
   My query summary in Cloudera Manager (5.13):
   
   [![Query in CM][4]][4]
   
   My Celery-Worker connection log:
   
   ```
   [2020-05-19 15:22:27,316: INFO/MainProcess] Received task: 
sql_lab.get_sql_results[2fa056d5-7e49-4794-adf9-8edcbae9195a]
   [2020-05-19 15:22:27,327: DEBUG/MainProcess] TaskPool: Apply <function 
_fast_trace_task at 0x7fdc57e248c8> (args:('sql_lab.get_sql_results', 
'2fa056d5-7e49-4794-adf9-8edcbae9195a', {'lang': 'py', 'task': 
'sql_lab.get_sql_results', 'id': '2fa056d5-7e49-4794-adf9-8edcbae9195a', 
'shadow': None, 'eta': None, 'expires': None, 'group': None, 'retries': 0, 
'timelimit': [21660, 21600], 'root_id': '2fa056d5-7e49-4794-adf9-8edcbae9195a', 
'parent_id': None, 'argsrepr': "(240, '-- Note: Unless you save your query, 
these tabs will NOT persist if you clear your cookies or change 
browsers.\nSELECT * FROM test_data.covid19 WHERE total_death >= 1000 && time = 
\\'2020-05-05\\';')", 'kwargsrepr': "{'return_results': False, 'store_results': 
True, 'user_name': 'logged_user', 'start_time': 1589901747190.4321, 
'expand_data': False, 'log_params': {'user_agent': 'Mozilla/5.0 (Windows NT 
6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 
Safari/537.36'}}", 'origin': 'gen32370@my_ip', 'reply_to': 
'9cb3eea9-e810-3400-bc4d-5f409e813fd3', 'correlation_id': 
'2fa056d5-7e49-4794-adf9-8edcbae9195a',... kwargs:{})
   [2020-05-19 15:22:27,332: DEBUG/MainProcess] Task accepted: 
sql_lab.get_sql_results[2fa056d5-7e49-4794-adf9-8edcbae9195a] pid:15409
   [2020-05-19 15:22:27,425: INFO/MainProcess] Query 240: Executing 1 
statement(s)
   [2020-05-19 15:22:27,425: INFO/MainProcess] Query 240: Set query to 'running'
   [2020-05-19 15:22:27,485: DEBUG/MainProcess] Connecting to HiveServer2 
my_host:21050 with GSSAPI authentication mechanism
   [2020-05-19 15:22:27,485: DEBUG/MainProcess] get_socket: host=my_host 
port=21050 use_ssl=True ca_cert=/path/to/my/cert.pem
   [2020-05-19 15:22:27,509: DEBUG/MainProcess] 
sock=<thriftpy2.transport.sslsocket.TSSLSocket object at 0x7fdc4f021320>
   [2020-05-19 15:22:27,510: DEBUG/MainProcess] get_transport: 
socket=<thriftpy2.transport.sslsocket.TSSLSocket object at 0x7fdc4f021320> 
host=my_host kerberos_service_name=impala auth_mechanism=GSSAPI 
user=logged_user password=fuggetaboutit
   [2020-05-19 15:22:27,543: DEBUG/MainProcess] 
transport=<thrift_sasl.TSaslClientTransport object at 0x7fdc4f051128> 
protocol=<thriftpy2.protocol.binary.TBinaryProtocol object at 0x7fdc4f01e6d8> 
service=<thriftpy2.thrift.TClient object at 0x7fdc4f01e0f0>
   [2020-05-19 15:22:27,544: DEBUG/MainProcess] 
HiveServer2Connection(service=<impala.hiveserver2.HS2Service object at 
0x7fdc4f01e5f8>, default_db=None)
   [2020-05-19 15:22:27,545: DEBUG/MainProcess] Getting a cursor (Impala 
session)
   [2020-05-19 15:22:27,545: DEBUG/MainProcess] .cursor(): getting new 
session_handle
   [2020-05-19 15:22:27,545: DEBUG/MainProcess] OpenSession: 
req=TOpenSessionReq(client_protocol=5, username='celery-worker', password=None, 
configuration=None)
   [2020-05-19 15:22:27,546: DEBUG/MainProcess] Attempting to open transport 
(tries_left=3)
   [2020-05-19 15:22:27,546: DEBUG/MainProcess] Transport opened
   ```
   PS: It always uses the user from the active keytab. I tried running 'kinit' 
with a keytab from another user, and the query then ran as the user from the 
new keytab.
   
   How can I enable impersonation correctly in my Superset? Maybe it is 
related to the `impala.doas.user` setting of the HiveServer2 connection, 
but I don't know how to set this properly or whether it is possible in Superset.
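
To make the `impala.doas.user` idea concrete, here is a minimal sketch of what it might look like at the driver level. It assumes the connection object behaves like impyla's, where `cursor()` takes a `configuration` dict that ends up in `TOpenSessionReq` (the log above shows `configuration=None`). The helper names `session_configuration` and `open_cursor` are hypothetical and are not part of Superset or impyla, and the Impala daemons would presumably also have to authorize the service principal as a proxy user.

```python
from typing import Any, Dict, Optional

# Session option mentioned above; whether Superset can set it is the open question.
DOAS_KEY = "impala.doas.user"


def session_configuration(username: Optional[str], impersonate: bool) -> Dict[str, str]:
    """Build the HiveServer2 session configuration for one query (hypothetical helper)."""
    if impersonate and username:
        return {DOAS_KEY: username}
    return {}


def open_cursor(conn: Any, username: Optional[str], impersonate: bool = True) -> Any:
    """Open a cursor on an impyla-style connection, delegating to `username`.

    `conn` is assumed to expose cursor(configuration=...), as impyla's
    HiveServer2 connection does.
    """
    config = session_configuration(username, impersonate)
    # Pass None instead of {} so a non-impersonated session opens as it does today.
    return conn.cursor(configuration=config or None)
```

With something like this, the Kerberos ticket of the celery-worker keytab would still authenticate the connection, and only the session would be delegated to the logged-in Superset user.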
   
   
     [1]: https://i.stack.imgur.com/fseo0.png
     [2]: https://i.stack.imgur.com/0aMvr.png
     [3]: https://i.stack.imgur.com/IdIjb.png
     [4]: https://i.stack.imgur.com/tZz3Z.png




