Hi,

You need to use the <credentials> tag in your Oozie workflow to pass HBase credentials, and to include hive-site.xml with the <file> tag so the Hive action can reach the Hive tables using the metastore URI and principal.
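Something along these lines (a rough sketch only; the workflow name, credential names, metastore URI, and principal are placeholders, the "hbase"/"hcat" type names depend on what your Oozie server has registered under oozie.credentials.credentialclasses in oozie-site.xml, and the action and script names are taken from your logs):

    <workflow-app xmlns="uri:oozie:workflow:0.2.5" name="order-reco-wf">

      <!-- type names below assume hbase/hcat credential classes are
           registered in oozie.credentials.credentialclasses on the server -->
      <credentials>
        <credential name="hbase_auth" type="hbase"/>
        <credential name="hive_auth" type="hcat">
          <property>
            <name>hcat.metastore.uri</name>
            <!-- placeholder: your metastore host and port -->
            <value>thrift://METASTORE_HOST:9083</value>
          </property>
          <property>
            <name>hcat.metastore.principal</name>
            <!-- placeholder: your metastore Kerberos principal -->
            <value>hive/_HOST@YOUR_REALM</value>
          </property>
        </credential>
      </credentials>

      <start to="fetch-order-details"/>

      <!-- the action must reference the credentials via the cred attribute -->
      <action name="fetch-order-details" cred="hbase_auth,hive_auth">
        <hive xmlns="uri:oozie:hive-action:0.2">
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <script>books-order.q</script>
          <!-- ship hive-site.xml with the action so the metastore settings are visible -->
          <file>hive-site.xml</file>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
      </action>

      <kill name="fail">
        <message>Hive action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
      </kill>
      <end name="end"/>
    </workflow-app>

The exact credential type names and property keys vary with the Oozie version and server configuration, so please check with your Oozie admin which names your server expects.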
Can you share your workflow definition so we can verify whether it is missing the above tags?

-- Mona

On 10/24/13 8:57 PM, "Srikar Appalalraju (Tech - VS)" <[email protected]> wrote:

> Hi all,
>
> *Problem statement -*
> We have an Oozie workflow that works with the orders HBase tables, so we have
> defined Hive external views over these HBase tables. The Hive action is
> invoked through Oozie. All of this runs on a Kerberos-secured cluster.
>
> The workflow starts and Hive begins executing the query, but authentication
> fails during the hand-off between Hive and HBase. Here is the stack trace I get:
>
> *stderr logs*
>
> MethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:372)
>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
>
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>   at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
>   at org.apache.hadoop.hbase.client.ServerCallable.translateException(ServerCallable.java:228)
>   at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:166)
>   at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
>   at $Proxy25.getAuthenticationToken(Unknown Source)
>   at org.apache.hadoop.hbase.security.token.TokenUtil.obtainToken(TokenUtil.java:54)
>   at org.apache.hadoop.hbase.security.token.TokenUtil$3.run(TokenUtil.java:161)
>   at org.apache.hadoop.hbase.security.token.TokenUtil$3.run(TokenUtil.java:159)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>   at org.apache.hadoop.hbase.security.token.TokenUtil.obtainTokenForJob(TokenUtil.java:158)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37)
>   at org.apache.hadoop.hbase.security.User$SecureHadoopUser.obtainAuthTokenForJob(User.java:486)
>   at org.apache.hadoop.hbase.mapred.TableMapReduceUtil.initCredentials(TableMapReduceUtil.java:174)
>   at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplits(HiveHBaseTableInputFormat.java:419)
>   at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)
>   at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:292)
>   at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1091)
>   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1083)
>   at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:993)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:946)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>   at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:946)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:920)
>   at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
>   at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1352)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1138)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
>   at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:711)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:261)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:238)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:49)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:491)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>   at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.security.AccessDeniedException): org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients
>   at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87)
>   at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.regionserver.HRegion.exec(HRegion.java:5093)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.execCoprocessor(HRegionServer.java:3570)
>   at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:372)
>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
>
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1021)
>   at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:164)
>   at $Proxy22.execCoprocessor(Unknown Source)
>   at org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1.call(ExecRPCInvoker.java:75)
>   at org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1.call(ExecRPCInvoker.java:73)
>   at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:163)
>   ... 61 more
> Job Submission failed with exception 'org.apache.hadoop.hbase.security.AccessDeniedException(org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients
>   at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87)
>   at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.regionserver.HRegion.exec(HRegion.java:5093)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.execCoprocessor(HRegionServer.java:3570)
>   at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:372)
>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
> )'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
> Intercepting System.exit(1)
> Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [1]
>
> *syslog logs*
>
> 2013-10-24 18:07:16,119 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
> 2013-10-24 18:07:16,759 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /grid/4/mapred/local/taskTracker/distcache/2269867183338974952_-690318494_1786161511/h3nn.ch.flipkart.com/user/fk-reco/recoAutomation/orderReco/books-order.q <- /grid/8/mapred/local/taskTracker/fk-reco/jobcache/job_201310071850_63754/attempt_201310071850_63754_m_000000_0/work/books-order.q
> 2013-10-24 18:07:16,765 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /grid/2/mapred/local/taskTracker/distcache/-8392045893796673174_-262239802_1345198338/h3nn.ch.flipkart.com/user/fk-reco/oozie/share/hadoop-ant.pom <- /grid/8/mapred/local/taskTracker/fk-reco/jobcache/job_201310071850_63754/attempt_201310071850_63754_m_000000_0/work/hadoop-ant.pom
> 2013-10-24 18:07:16,769 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /grid/8/mapred/local/taskTracker/distcache/3421064608938800210_-858967610_412585130/h3nn.ch.flipkart.com/user/fk-reco/oozie/share/hive <- /grid/8/mapred/local/taskTracker/fk-reco/jobcache/job_201310071850_63754/attempt_201310071850_63754_m_000000_0/work/hive
> 2013-10-24 18:07:16,773 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /grid/7/mapred/local/taskTracker/distcache/9085893969845301951_1910424006_1345203775/h3nn.ch.flipkart.com/user/fk-reco/oozie/share/maven-scm-providers-standard.pom <- /grid/8/mapred/local/taskTracker/fk-reco/jobcache/job_201310071850_63754/attempt_201310071850_63754_m_000000_0/work/maven-scm-providers-standard.pom
> 2013-10-24 18:07:16,875 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
> 2013-10-24 18:07:16,876 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> 2013-10-24 18:07:16,902 WARN org.apache.hadoop.conf.Configuration: slave.host.name is deprecated. Instead, use dfs.datanode.hostname
> 2013-10-24 18:07:17,564 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2013-10-24 18:07:17,593 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7d56b386
> 2013-10-24 18:07:17,997 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:fk-reco (*auth:SIMPLE*) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
> 2013-10-24 18:07:17,999 WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
> 2013-10-24 18:07:17,999 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:fk-reco (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
> 2013-10-24 18:07:18,341 INFO org.apache.hadoop.mapred.MapTask: Processing split: hdfs://h3nn.ch.flipkart.com:8020/user/fk-reco/oozie-oozi/0000003-131024173744969-oozie-oozi-W/fetch-order-details--hive/input/dummy.txt:0+5
> 2013-10-24 18:07:18,394 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and BYTES_READ as counter name instead
> 2013-10-24 18:07:18,403 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
>
> *Hive query that I run*
>
> SELECT o.account_id, oi.order_id, oi.product_id
> FROM orders o JOIN order_items oi ON (o.id = oi.order_id)
> WHERE o.account_id <> ''
> AND oi.category_id = 22
> ORDER BY o.account_id, oi.order_id;
>
> *Other Info -*
>
>  1. As you can see in the *stderr log & syslog*, it complains about an auth/credentials failure. The syslog also reports auth:SIMPLE; I wonder why that happens.
>  2. The Hive query is a join between two or more HBase tables; it is not a single straightforward query.
>  3. So we have three levels of execution here (Oozie -> Hive -> HBase).
>  4. We could write a Java process that does this join while running as a mapper, but that would not scale: the orders tables hold a lot of data, so it is not possible to load all of it into every mapper.
>  5. Caching is also out of the question, since the orders data is ever-changing and we need all of the orders data every day.
>
> I have been trying to solve this for a few days now. Any help would be appreciated.
>
> Thanks,
> Srikar
