Hi all,
*Problem statement*
We have an Oozie workflow that reads from our orders HBase tables, so we
have defined Hive external tables over these HBase tables. The Hive action
is invoked through Oozie, and all of this runs on a Kerberos-secured cluster.
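
For anyone unfamiliar with this kind of setup, a Hive table backed by
HBase is typically declared with Hive's HBase storage handler. The
following is only a minimal sketch: the column family `d` and the column
list are placeholders made up for illustration, not our real schema, and
order_items would be declared in the same way.

```sql
-- Illustrative only: the column family "d" and this column list are assumptions.
CREATE EXTERNAL TABLE orders (
  id         STRING,   -- mapped to the HBase row key
  account_id STRING    -- mapped to a column in the hypothetical family "d"
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:account_id')
TBLPROPERTIES ('hbase.table.name' = 'orders');
```

With tables declared this way, Hive reads through
HiveHBaseTableInputFormat, which is the class that requests the HBase
token in the trace.
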
The workflow starts and Hive begins executing the query, but authentication
fails during the hand-off between Hive and HBase. Here is the stack trace I
get:
*stderr logs*
MethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:372)
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
at
org.apache.hadoop.hbase.client.ServerCallable.translateException(ServerCallable.java:228)
at
org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:166)
at
org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
at $Proxy25.getAuthenticationToken(Unknown Source)
at
org.apache.hadoop.hbase.security.token.TokenUtil.obtainToken(TokenUtil.java:54)
at
org.apache.hadoop.hbase.security.token.TokenUtil$3.run(TokenUtil.java:161)
at
org.apache.hadoop.hbase.security.token.TokenUtil$3.run(TokenUtil.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at
org.apache.hadoop.hbase.security.token.TokenUtil.obtainTokenForJob(TokenUtil.java:158)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37)
at
org.apache.hadoop.hbase.security.User$SecureHadoopUser.obtainAuthTokenForJob(User.java:486)
at
org.apache.hadoop.hbase.mapred.TableMapReduceUtil.initCredentials(TableMapReduceUtil.java:174)
at
org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplits(HiveHBaseTableInputFormat.java:419)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)
at
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:292)
at
org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1091)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1083)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:993)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:946)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:946)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:920)
at
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at
org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1352)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1138)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
at
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
at
org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:711)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:261)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:238)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:491)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.security.AccessDeniedException):
org.apache.hadoop.hbase.security.AccessDeniedException: Token
generation only allowed for Kerberos authenticated clients
at
org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87)
at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.regionserver.HRegion.exec(HRegion.java:5093)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.execCoprocessor(HRegionServer.java:3570)
at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:372)
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1021)
at
org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:164)
at $Proxy22.execCoprocessor(Unknown Source)
at
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1.call(ExecRPCInvoker.java:75)
at
org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1.call(ExecRPCInvoker.java:73)
at
org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:163)
... 61 more
Job Submission failed with exception
'org.apache.hadoop.hbase.security.AccessDeniedException(org.apache.hadoop.hbase.security.AccessDeniedException:
Token generation only allowed for Kerberos authenticated clients
at
org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87)
at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.regionserver.HRegion.exec(HRegion.java:5093)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.execCoprocessor(HRegionServer.java:3570)
at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:372)
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
)'
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.MapRedTask
Intercepting System.exit(1)
Failing Oozie Launcher, Main class
[org.apache.oozie.action.hadoop.HiveMain], exit code [1]
*syslog logs*
2013-10-24 18:07:16,119 WARN mapreduce.Counters: Group
org.apache.hadoop.mapred.Task$Counter is deprecated. Use
org.apache.hadoop.mapreduce.TaskCounter instead
2013-10-24 18:07:16,759 INFO org.apache.hadoop.mapred.TaskRunner:
Creating symlink:
/grid/4/mapred/local/taskTracker/distcache/2269867183338974952_-690318494_1786161511/h3nn.ch.flipkart.com/user/fk-reco/recoAutomation/orderReco/books-order.q
<-
/grid/8/mapred/local/taskTracker/fk-reco/jobcache/job_201310071850_63754/attempt_201310071850_63754_m_000000_0/work/books-order.q
2013-10-24 18:07:16,765 INFO org.apache.hadoop.mapred.TaskRunner:
Creating symlink:
/grid/2/mapred/local/taskTracker/distcache/-8392045893796673174_-262239802_1345198338/h3nn.ch.flipkart.com/user/fk-reco/oozie/share/hadoop-ant.pom
<-
/grid/8/mapred/local/taskTracker/fk-reco/jobcache/job_201310071850_63754/attempt_201310071850_63754_m_000000_0/work/hadoop-ant.pom
2013-10-24 18:07:16,769 INFO org.apache.hadoop.mapred.TaskRunner:
Creating symlink:
/grid/8/mapred/local/taskTracker/distcache/3421064608938800210_-858967610_412585130/h3nn.ch.flipkart.com/user/fk-reco/oozie/share/hive
<-
/grid/8/mapred/local/taskTracker/fk-reco/jobcache/job_201310071850_63754/attempt_201310071850_63754_m_000000_0/work/hive
2013-10-24 18:07:16,773 INFO org.apache.hadoop.mapred.TaskRunner:
Creating symlink:
/grid/7/mapred/local/taskTracker/distcache/9085893969845301951_1910424006_1345203775/h3nn.ch.flipkart.com/user/fk-reco/oozie/share/maven-scm-providers-standard.pom
<-
/grid/8/mapred/local/taskTracker/fk-reco/jobcache/job_201310071850_63754/attempt_201310071850_63754_m_000000_0/work/maven-scm-providers-standard.pom
2013-10-24 18:07:16,875 WARN org.apache.hadoop.conf.Configuration:
session.id is deprecated. Instead, use dfs.metrics.session-id
2013-10-24 18:07:16,876 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=MAP, sessionId=
2013-10-24 18:07:16,902 WARN org.apache.hadoop.conf.Configuration:
slave.host.name is deprecated. Instead, use dfs.datanode.hostname
2013-10-24 18:07:17,564 INFO org.apache.hadoop.util.ProcessTree:
setsid exited with exit code 0
2013-10-24 18:07:17,593 INFO org.apache.hadoop.mapred.Task: Using
ResourceCalculatorPlugin :
org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7d56b386
2013-10-24 18:07:17,997 ERROR
org.apache.hadoop.security.UserGroupInformation:
PriviledgedActionException as:fk-reco (*auth:SIMPLE*)
cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby
2013-10-24 18:07:17,999 WARN org.apache.hadoop.ipc.Client: Exception
encountered while connecting to the server :
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby
2013-10-24 18:07:17,999 ERROR
org.apache.hadoop.security.UserGroupInformation:
PriviledgedActionException as:fk-reco (auth:SIMPLE)
cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby
2013-10-24 18:07:18,341 INFO org.apache.hadoop.mapred.MapTask:
Processing split:
hdfs://h3nn.ch.flipkart.com:8020/user/fk-reco/oozie-oozi/0000003-131024173744969-oozie-oozi-W/fetch-order-details--hive/input/dummy.txt:0+5
2013-10-24 18:07:18,394 WARN mapreduce.Counters: Counter name
MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group
name and BYTES_READ as counter name instead
2013-10-24 18:07:18,403 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
*hive query that I run*
SELECT o.account_id, oi.order_id, oi.product_id
FROM orders o
JOIN order_items oi ON (o.id = oi.order_id)
WHERE o.account_id <> ''
  AND oi.category_id = 22
ORDER BY o.account_id, oi.order_id;
*Other info*
1. As you can see, both the *stderr log and the syslog* report an
authentication failure. The syslog also shows auth:SIMPLE, even though the
cluster is Kerberos-secured. I don't understand why that happens.
2. The Hive query is a join between two or more HBase tables; it's not a
single straightforward query.
3. So there are three levels of execution here (Oozie -> Hive -> HBase).
4. We could write a Java process that performs this join while running as a
mapper, but it would not scale: the orders tables hold a lot of data, so
it's not possible to load all of it in every mapper.
5. Caching is also out of the question, since orders data is ever-changing
and we need to fetch all of it every day.
I have been trying to solve this for a few days now. Any help would be
appreciated.
Thanks,
Srikar