Lars Francke created HBASE-20797:
------------------------------------
Summary: hbase-spark
Key: HBASE-20797
URL: https://issues.apache.org/jira/browse/HBASE-20797
Project: HBase
Issue Type: Bug
Components: spark
Affects Versions: 3.0.0
Reporter: Lars Francke
We're running into an issue with the Spark integration when using Hadoop
2.7.2. The problem is this line of code from {{HBaseContext.scala}}:
{code:java}
ugi.setAuthenticationMethod(AuthenticationMethod.PROXY)
{code}
I'm not an expert, but I think this code is wrong. If we wanted to create a
proxy user, we'd need to use {{UserGroupInformation.createProxyUser(...)}},
which would also set the realUser etc. Also: I don't think it makes sense to
create a proxy user on the client side? The chances are good that the user we're
authenticating as doesn't even have proxy privileges, as those are usually only
granted to servers.
We've tried to trace where this line of code came from in Git but it was a code
drop back in Ted's original repo.
The error we're seeing actually occurs when (in a Spark job) we access HDFS
because KMSClientProvider has code like this:
{code:java}
actualUgi =
    (UserGroupInformation.getCurrentUser().getAuthenticationMethod() ==
    UserGroupInformation.AuthenticationMethod.PROXY) ? UserGroupInformation
        .getCurrentUser().getRealUser() : UserGroupInformation
        .getCurrentUser();
But we never set up the realUser, so {{actualUgi}} is null, which later leads
to a NullPointerException.
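To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use Hadoop at all: {{Ugi}} is a hypothetical stand-in for {{UserGroupInformation}} that only models the auth method and the realUser field, and {{resolveActualUgi}} mirrors the ternary quoted above, assuming the non-proxy branch falls back to the current user. It is meant to show why flagging a user as PROXY without a realUser yields null, not to reproduce the real classes.

```java
import java.util.Objects;

public class ProxyUgiSketch {

    enum AuthenticationMethod { SIMPLE, KERBEROS, PROXY }

    // Hypothetical stand-in for UserGroupInformation; NOT the Hadoop class.
    static final class Ugi {
        private final String userName;
        private final Ugi realUser; // stays null unless built via createProxyUser
        private AuthenticationMethod method;

        private Ugi(String userName, Ugi realUser, AuthenticationMethod method) {
            this.userName = userName;
            this.realUser = realUser;
            this.method = method;
        }

        // Plain user: no realUser behind it.
        static Ugi createRemoteUser(String userName) {
            return new Ugi(userName, null, AuthenticationMethod.SIMPLE);
        }

        // Proxy user: sets the PROXY method and the realUser together.
        static Ugi createProxyUser(String userName, Ugi realUser) {
            Objects.requireNonNull(realUser, "realUser");
            return new Ugi(userName, realUser, AuthenticationMethod.PROXY);
        }

        // Only flips the flag; realUser is left untouched.
        void setAuthenticationMethod(AuthenticationMethod method) {
            this.method = method;
        }

        AuthenticationMethod getAuthenticationMethod() { return method; }
        Ugi getRealUser() { return realUser; }
        String getUserName() { return userName; }
    }

    // Same shape as the ternary quoted from KMSClientProvider.
    static Ugi resolveActualUgi(Ugi current) {
        return current.getAuthenticationMethod() == AuthenticationMethod.PROXY
                ? current.getRealUser()
                : current;
    }

    public static void main(String[] args) {
        // What the line in HBaseContext.scala effectively does: flag only.
        Ugi flagged = Ugi.createRemoteUser("alice");
        flagged.setAuthenticationMethod(AuthenticationMethod.PROXY);
        Ugi actualUgi = resolveActualUgi(flagged);
        System.out.println("flag only: actualUgi is null? " + (actualUgi == null));

        // A properly constructed proxy user resolves to its real user.
        Ugi service = Ugi.createRemoteUser("service");
        Ugi proxied = Ugi.createProxyUser("alice", service);
        System.out.println("proxy user resolves to: "
                + resolveActualUgi(proxied).getUserName());

        // Any later dereference of the null result fails.
        try {
            actualUgi.getUserName();
        } catch (NullPointerException e) {
            System.out.println("dereferencing actualUgi throws NullPointerException");
        }
    }
}
```

In this model, dropping the {{setAuthenticationMethod(PROXY)}} call makes {{resolveActualUgi}} return the current user itself, which is the behaviour the proposed fix relies on.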
I _think_ the proper fix is to just remove that line, as I have no idea what it
is meant to do. I can provide a patch, but I'd like to get input first. Maybe
I'm mistaken?