mayukhghoshme commented on issue #4951: Connecting Superset to Hive with 
Kerberos failing
URL: 
https://github.com/apache/incubator-superset/issues/4951#issuecomment-387712454
 
 
   Okay, I figured it out. As I mentioned earlier, it was looking for the 
Kerberos credential cache of the user 'root': `No Kerberos credentials available 
(default cache: /tmp/krb5cc_0)`. The file name gave me the hint, since Kerberos 
credential cache files are usually suffixed with the uid of the user. In this 
case the suffix is _0, which is the uid of 'root'.
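
To see which cache file a given user would be looking for, the naming rule above can be sketched in Python. This is a minimal sketch that assumes the common `/tmp/krb5cc_<uid>` default and the `KRB5CCNAME` override; a site's `krb5.conf` may configure a different location, and `default_ccache_path` is a hypothetical helper name.

```python
import os


def default_ccache_path(uid=None, environ=None):
    """Return the credential cache location Kerberos would likely use.

    Assumption: the default is /tmp/krb5cc_<uid>. A site's krb5.conf
    may set a different default_ccache_name.
    """
    environ = os.environ if environ is None else environ
    # KRB5CCNAME, when set, takes precedence over the built-in default.
    if environ.get("KRB5CCNAME"):
        return environ["KRB5CCNAME"]
    uid = os.getuid() if uid is None else uid
    return "/tmp/krb5cc_{}".format(uid)
```

Calling `default_ccache_path()` as the user that runs Superset shows whose ticket the driver will try to pick up; for 'root' (uid 0) it yields the `/tmp/krb5cc_0` path from the error message.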
   
   So the resolution is:
   
   1.  All the steps below should be performed as a user who can authenticate 
against the KDC, not as some user like 'root'.
   
   
![image](https://user-images.githubusercontent.com/34889904/39811956-65ea65fa-53a8-11e8-9456-8e817e94feea.png)
   
   2. Using Impyla seemed to be more elegant than PyHive. Install `impyla` 
using pip as root.
   
   `pip install impyla`
   
   3. You need the keytab file for the user from step 1. Run `kinit` with 
that keytab as that user.
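
For reference, the keytab-based `kinit` can also be scripted. This is a rough sketch, assuming MIT Kerberos command-line tools (`kinit -k -t`, `klist -s`) are on the PATH; the function names, keytab path and principal are hypothetical placeholders, not values from this setup.

```python
import subprocess


def kinit_command(keytab, principal):
    """Build the argv for a keytab-based kinit (MIT Kerberos -k/-t flags)."""
    return ["kinit", "-k", "-t", keytab, principal]


def obtain_ticket(keytab, principal):
    """Run kinit as the current user and report whether a valid ticket exists.

    Raises subprocess.CalledProcessError if kinit itself fails.
    """
    subprocess.check_call(kinit_command(keytab, principal))
    # klist -s prints nothing and exits non-zero when the cache
    # cannot be read or the ticket has expired.
    return subprocess.call(["klist", "-s"]) == 0
```

For example, `obtain_ticket("/path/to/user.keytab", "user@EXAMPLE.COM")` would have to run as the same user that runs Superset, so that the ticket lands in that user's cache.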
   
   4. Configure the connection in Superset along these lines:
   
   `SQLAlchemy URI = impala://<hive_host>:10000/default`
   
   `Extra`:

   ```
   {
       "metadata_params": {},
       "engine_params": {
           "connect_args": {
               "auth_mechanism": "GSSAPI",
               "kerberos_service_name": "hive"
           }
       }
   }
   ```
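
Since a stray brace in the Extra field is easy to miss, one option is to generate the JSON from a Python dict instead of typing it by hand. This is only a sketch; `build_extra` is a hypothetical helper, and the keys are the ones shown above.

```python
import json


def build_extra(auth_mechanism="GSSAPI", kerberos_service_name="hive"):
    """Return the Extra JSON for the Superset database form as a string."""
    extra = {
        "metadata_params": {},
        "engine_params": {
            # Same connect_args as in the Extra block above.
            "connect_args": {
                "auth_mechanism": auth_mechanism,
                "kerberos_service_name": kerberos_service_name,
            }
        },
    }
    # json.dumps always emits well-formed JSON with balanced braces.
    return json.dumps(extra, indent=4)
```

The output of `print(build_extra())` can be pasted into the Extra field as-is.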
   
   5. If something goes wrong with the packages, you can log in to the 
Superset server and use the Python shell to check that the modules work, 
like this:
   
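   ```
   from sqlalchemy import create_engine, text

   # Same URI and connect_args as configured in Superset above
   engine = create_engine(
       'impala://<hive_host>:10000/default',
       connect_args={'auth_mechanism': 'GSSAPI', 'kerberos_service_name': 'hive'})
   c = engine.connect()
   # A trivial query to confirm the Kerberos-authenticated connection works
   result = c.execute(text('SELECT count(*) from my_table'))
   print(result.fetchall())
   ```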
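
Before running the connection test, it can help to confirm that the required packages import at all. This is a small sketch. The module list is an assumption: `impala` is the import name of `impyla`, and `thrift_sasl` and `sasl` are the extra packages impyla typically needs for GSSAPI, so adjust it to what is actually installed.

```python
import importlib

# Assumed import names for this setup; adjust as needed.
REQUIRED_MODULES = ("sqlalchemy", "impala", "thrift_sasl", "sasl")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError as well as broken installs.
            missing.append(name)
    return missing
```

An empty list from `missing_modules()` suggests the packages are in place, so a remaining failure is more likely a Kerberos or connection-string problem.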
