[
https://issues.apache.org/jira/browse/HIVE-21748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16854517#comment-16854517
]
Aron Hamvas commented on HIVE-21748:
------------------------------------
The issue is not in HBaseStorageHandler but in the MapredLocalTask and
SecureCmdDoAs classes.
When a local task is executed in a child JVM (i.e., a new process), the tokens
are collected and written to a token storage file like this. The logic is
hardwired to pick up delegation tokens only from the FileSystem.
{code:java}
public SecureCmdDoAs(HiveConf conf) throws HiveException, IOException {
  // Get delegation token for user from filesystem and write the token along with
  // metastore tokens into a file
  String uname = UserGroupInformation.getLoginUser().getShortUserName();
  FileSystem fs = FileSystem.get(conf);
  Credentials cred = new Credentials();
  // Use method addDelegationTokens instead of getDelegationToken to get all
  // the tokens including KMS.
  fs.addDelegationTokens(uname, cred);
  tokenFile = File.createTempFile("hive_hadoop_delegation_token", null);
  tokenPath = new Path(tokenFile.toURI());
  // write credential with token to file
  cred.writeTokenStorageFile(tokenPath, conf);
}
{code}
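The gap can be illustrated with a toy model (plain Java maps standing in for Hadoop's Credentials/UGI API; the class and method names below are illustrative, not Hive's real code): because the constructor above asks only the FileSystem for tokens, any HBASE_AUTH_TOKEN already attached to the caller's credentials never reaches the token file that the child JVM reads. A sketch of a fix would merge the caller's existing tokens in as well.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of delegation-token collection for the child JVM's token file.
// Names are illustrative only; this is not Hive's SecureCmdDoAs.
public class TokenFileModel {

    // Current behavior: only filesystem-supplied tokens reach the token file.
    static Map<String, String> collectFsOnly(Map<String, String> fsTokens) {
        return new HashMap<>(fsTokens);
    }

    // Sketched fix: also merge tokens already attached to the caller's
    // credentials (e.g. an HBASE_AUTH_TOKEN added by HBaseStorageHandler)
    // before the file is written.
    static Map<String, String> collectMerged(Map<String, String> fsTokens,
                                             Map<String, String> callerTokens) {
        Map<String, String> merged = new HashMap<>(callerTokens);
        merged.putAll(fsTokens);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> fsTokens = Map.of("HDFS_DELEGATION_TOKEN", "hdfs-token");
        Map<String, String> callerTokens = Map.of("HBASE_AUTH_TOKEN", "hbase-token");

        // The child JVM only ever sees what was written to the token file:
        System.out.println(collectFsOnly(fsTokens).containsKey("HBASE_AUTH_TOKEN"));                // false
        System.out.println(collectMerged(fsTokens, callerTokens).containsKey("HBASE_AUTH_TOKEN"));  // true
    }
}
```

In real Hadoop terms the merge step would correspond to something like copying the current user's credentials into the `Credentials` object before `writeTokenStorageFile`, but the exact wiring inside SecureCmdDoAs is left open here.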
If we turn off 'hive.exec.submit.local.task.via.child', the problem
disappears, because the local task then runs inside the HiveServer2 process
and never goes through the token file; the trade-off, of course, is a higher
chance of running into out-of-memory issues in HS2.
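For reference, the workaround can be applied per session with standard Hive syntax (weigh it against the memory trade-off noted above):
{code:none}
SET hive.exec.submit.local.task.via.child=false;
{code}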
> HBase Operations Can Fail When Using MAPREDLOCAL
> ------------------------------------------------
>
> Key: HIVE-21748
> URL: https://issues.apache.org/jira/browse/HIVE-21748
> Project: Hive
> Issue Type: Bug
> Components: HBase Handler
> Affects Versions: 4.0.0, 3.2.0
> Reporter: David Mollitor
> Priority: Major
> Attachments: HBaseMapredLocalExplain.txt
>
>
> https://github.com/apache/hive/blob/5634140b2beacdac20ceec8c73ff36bce5675ef8/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java#L258-L262
> {code:java|title=HBaseStorageHandler.java}
> if (this.configureInputJobProps) {
>   LOG.info("Configuring input job properties");
>   ...
>   try {
>     addHBaseDelegationToken(jobConf);
>   } catch (IOException | MetaException e) {
>     throw new IllegalStateException("Error while configuring input job properties", e);
>   }
> } else {
>   LOG.info("Configuring output job properties");
>   ...
> }
> {code}
> What we can see here is that the HBase delegation token is only created when
> there is an input job (reading from an HBase source). For a particular stage
> of a query, if there is no HBase input, only HBase output, then the
> delegation token is not created, and the subsequent MapredLocalTask may fail
> because it cannot connect to HBase without the token.
> {code:none|title=Error Message in HS2 Log}
> 2019-05-17 10:24:55,036 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-388]: Error running hive query:
> org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
>   at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
>   at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
>   at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
>   at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
>   at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> You can tell the query will fail because an HDFS token is created, but no
> HBase token is reported in the HS2 logs. The following is an example of a
> proper run; if the HBASE_AUTH_TOKEN entry is missing, the query will likely fail.
> {code:none|title=Logging of a Proper Run}
> 2019-05-17 10:36:15,593 INFO org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Submitting tokens for job: job_1557858663665_0048
> 2019-05-17 10:36:15,593 INFO org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Kind: HDFS_DELEGATION_TOKEN, Service: 10.17.101.237:8020, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive/[email protected], renewer=yarn, realUser=, issueDate=1558114574357, maxDate=1558719374357, sequenceNumber=75, masterKeyId=4)
> 2019-05-17 10:36:15,593 INFO org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Kind: HBASE_AUTH_TOKEN, Service: 9b282733-7927-4785-92ea-dad419f6f055, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@b1)
> 2019-05-17 10:36:15,859 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: [HiveServer2-Background-Pool: Thread-455]: Submitted application application_1557858663665_0048
> {code}
> Error message in the local MapReduce log:
> {code:none|title=Error message}
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.security.AccessDeniedException): org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user 'xxxx' (table=xyz, action=READ)
>   at org.apache.hadoop.hbase.security.access.AccessController.internalPreRead(AccessController.java:1611)
>   at org.apache.hadoop.hbase.security.access.AccessController.preScannerOpen(AccessController.java:2080)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$50.call(RegionCoprocessorHost.java:1300)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHos
> or perhaps...
> 2019-05-10 07:43:24,875 WARN [htable-pool2-t1]: security.UserGroupInformation (UserGroupInformation.java:doAs(1927)) - PriviledgedActionException as:hive (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2019-05-10 07:43:24,876 WARN [htable-pool2-t1]: ipc.RpcClientImpl (RpcClientImpl.java:run(675)) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2019-05-10 07:43:24,876 ERROR [htable-pool2-t1]: ipc.RpcClientImpl (RpcClientImpl.java:run(685)) - SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>   at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)