morristm opened a new issue #333:
URL: https://github.com/apache/submarine/issues/333
I built the Submarine security plugin for Ranger 1.2 and Spark 2.3. With
pyspark and spark-submit, it retrieves the Hive policy file from Ranger and
stores it on the file system. However, when I use Jupyter and Livy to connect
to an HDP 3.1.5 cluster, it fails to get the policy file. Instead I get this
error:
WARN RangerAdminRESTClient: Error getting policies. secureMode=true,
user=xxxxxxx (auth:SIMPLE), response={"httpStatusCode":401,"statusCode":0},
serviceName=XXXX_hive
My cluster is kerberized, and when Livy starts a YARN application it uses a
proxy ID to do so, so I believe some level of impersonation is occurring. In
the YARN application, no TGT is created for the user ID that is running the
application. By contrast, in the scenario above, where I use a pyspark shell
or spark-submit, I do have a TGT for the user ID. I believe that having a TGT
in the working scenario is what lets me authenticate to Ranger and pull the
policy, and that without a TGT the authentication fails and Ranger returns a
401. So my question is: how is Ranger Admin called to pull the policy, and
what credentials are used to log in? Can you point me to the code that does
the authentication? Do you have any ideas on how to fix this?
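One way to check the TGT hypothesis above from inside each environment (the pyspark shell versus the Livy-launched session) is a small diagnostic like the sketch below. It is only an illustration, not part of the plugin: the function name `describe_credentials` is hypothetical, it assumes the MIT Kerberos default cache location `/tmp/krb5cc_<uid>` when `KRB5CCNAME` is unset (a site's `krb5.conf` may configure something else), and it only checks that the files exist, not that the ticket or tokens in them are valid.

```python
import os
from typing import Mapping, Optional


def describe_credentials(env: Mapping[str, str], uid: int) -> dict:
    """Report which credential files are visible to this process.

    Hypothetical helper for illustration; it inspects the environment only.
    """
    # KRB5CCNAME may be "FILE:/path", a bare path, or a non-file cache type
    # such as "KEYRING:...". The fallback below is an assumed default.
    ccname = env.get("KRB5CCNAME", f"FILE:/tmp/krb5cc_{uid}")

    cache_path: Optional[str] = None
    if ccname.startswith("FILE:"):
        cache_path = ccname[len("FILE:"):]
    elif ":" not in ccname:
        cache_path = ccname
    has_cache = cache_path is not None and os.path.isfile(cache_path)

    # YARN containers normally receive Hadoop delegation tokens through the
    # file named by HADOOP_TOKEN_FILE_LOCATION rather than a Kerberos TGT.
    token_file = env.get("HADOOP_TOKEN_FILE_LOCATION")
    has_tokens = bool(token_file) and os.path.isfile(token_file)

    return {
        "ticket_cache": cache_path if has_cache else None,
        "delegation_token_file": token_file if has_tokens else None,
    }


if __name__ == "__main__":
    print(describe_credentials(os.environ, os.getuid()))
```

If the Livy session reports no ticket cache but does report a delegation token file, that would be consistent with the driver holding only Hadoop delegation tokens, which by themselves would not be expected to satisfy a Kerberos login to Ranger Admin.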
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org