mayukhghoshme opened a new issue #4951: Connecting Superset to Hive with Kerberos failing
URL: https://github.com/apache/incubator-superset/issues/4951
 
 
   Make sure these boxes are checked before submitting your issue - thank you!
   
   - [ ] I have checked the superset logs for python stacktraces and included it here as text if any
   - [ ] I have reproduced the issue with at least the latest released version of superset
   - [ ] I have checked the issue tracker for the same issue and I haven't found one similar
   
   
   ### Superset version
   0.22.1
   
   I am trying to connect Superset to a Kerberised Hive cluster, but the connection fails with the following Kerberos error:
   
   `2018-05-08 01:59:37,051:ERROR:root:Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (No Kerberos credentials available (default cache: /tmp/krb5cc_0))`
   `Traceback (most recent call last):`
   `  File "/usr/lib/python2.7/site-packages/superset/views/core.py", line 1507, in testconn
       engine.connect()`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2091, in connect
       return self._connection_cls(self, **kwargs)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 90, in __init__
       if connection is not None else engine.raw_connection()`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2177, in raw_connection
       self.pool.unique_connection, _connection)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2147, in _wrap_pool_connect
       return fn()`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 328, in unique_connection
       return _ConnectionFairy._checkout(self)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 766, in _checkout
       fairy = _ConnectionRecord.checkout(pool)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 516, in checkout
       rec = pool._do_get()`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 1138, in _do_get
       self._dec_overflow()`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
       compat.reraise(exc_type, exc_value, exc_tb)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 1135, in _do_get
       return self._create_connection()`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 333, in _create_connection
       return _ConnectionRecord(self)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 461, in __init__
       self.__connect(first_connect_check=True)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 651, in __connect
       connection = pool._invoke_creator(self)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 105, in connect
       return dialect.connect(*cargs, **cparams)`
   `  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 393, in connect
       return self.dbapi.connect(*cargs, **cparams)`
   `  File "/usr/lib/python2.7/site-packages/pyhive/hive.py", line 64, in connect
       return Connection(*args, **kwargs)`
   `  File "/usr/lib/python2.7/site-packages/pyhive/hive.py", line 159, in __init__
       self._transport.open()`
   `  File "/usr/lib/python2.7/site-packages/thrift_sasl/__init__.py", line 79, in open
       message=("Could not start SASL: %s" % self.sasl.getError()))`
   `TTransportException: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (No Kerberos credentials available (default cache: /tmp/krb5cc_0))`
   
   
   Below is the setup and things I have tried:
   
   1. Installed Superset on an AWS EC2 instance, following the docs.
   2. Started the Superset web server as root on port 80.
   3. Installed the necessary Kerberos packages.
   4. Created a user (let us call this X), obtained a keytab for it, and was able to run kinit.
   5. Creating a new data source from the UI with the connection string below fails:
   `hive://xx.xx.xx.xx:10000/default?auth=KERBEROS&kerberos_service_name=hive`
   6. Tried both with and without "Impersonate the Logged on user".
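
The query string of the connection string in step 5 breaks down into two driver options. A minimal sketch using only the Python standard library shows what they are. The helper name `hive_url_options` is made up for illustration, and whether Superset and PyHive forward these query parameters to the driver unchanged is an assumption that is not verified here:

```python
from urllib.parse import urlsplit, parse_qsl


def hive_url_options(url):
    """Split a hive:// connection URL into host, port, database and query options.

    Hypothetical helper for illustration; not part of Superset or PyHive.
    """
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        # Query parameters become a plain dict of option name -> value.
        "options": dict(parse_qsl(parts.query)),
    }


url = "hive://xx.xx.xx.xx:10000/default?auth=KERBEROS&kerberos_service_name=hive"
print(hive_url_options(url)["options"])
# prints {'auth': 'KERBEROS', 'kerberos_service_name': 'hive'}
```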
   
   I am able to connect to Hive from the Python shell as user X using SQLAlchemy:
   
   
   `import sqlalchemy`
   `engine = sqlalchemy.create_engine('hive://xx.xx.xx.xx:10000/default', connect_args={'auth': 'KERBEROS', 'kerberos_service_name': 'hive'})`
   `c = engine.connect()`
   `result = c.execute('SELECT count(*) from my_schema.my_table')`
   `result.fetchall()`
   `[(5132,)]`
   
   Judging by the error message, it seems to me that it is trying to look for a Kerberos credential cache for the user root (UID 0): `No Kerberos credentials available (default cache: /tmp/krb5cc_0)`
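
The cache path in that message can be reproduced with a short sketch. It assumes the common MIT Kerberos default of a file cache at `/tmp/krb5cc_<uid>`, with the `KRB5CCNAME` environment variable taking precedence when set; a `default_ccache_name` setting in krb5.conf is not considered. The function name `default_ccache_path` is hypothetical:

```python
import os


def default_ccache_path(uid, env):
    """Return the credential cache a process with this UID would likely use.

    Assumption: the default is the file /tmp/krb5cc_<uid>, unless the
    KRB5CCNAME environment variable overrides it. Non-file cache types
    (for example KEYRING:) are returned unchanged.
    """
    override = env.get("KRB5CCNAME")
    if override:
        # Strip an optional "FILE:" type prefix to get a plain path.
        if override.startswith("FILE:"):
            return override[len("FILE:"):]
        return override
    return "/tmp/krb5cc_{}".format(uid)


if __name__ == "__main__":
    path = default_ccache_path(os.getuid(), os.environ)
    state = "exists" if os.path.exists(path) else "missing"
    print("uid:", os.getuid())
    print("credential cache:", path, "({})".format(state))
```

Run under the account that starts the Superset web server, this shows which cache that process would look for. For UID 0 with no override it yields `/tmp/krb5cc_0`, the path in the error above, while a ticket obtained by kinit under another account would normally land in a different file.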
   
   Am I missing something here? 
   
   
   
   

