[ https://issues.apache.org/jira/browse/SPARK-24627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ABHISHEK KUMAR GUPTA updated SPARK-24627:
-----------------------------------------
    Environment: OS: SUSE11
                 Spark Version: 2.3.0
    (was: OS: SUSE11, Spark Version: 2.3.0, Hadoop: 2.8.3)

[Spark 2.3.0] After the HDFS token expires, kinit does not restore the ability to submit jobs via beeline
-------------------------------------------------------------------------------

                 Key: SPARK-24627
                 URL: https://issues.apache.org/jira/browse/SPARK-24627
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.0
         Environment: OS: SUSE11
                      Spark Version: 2.3.0
            Reporter: ABHISHEK KUMAR GUPTA
            Priority: Major

Steps (the beeline session had been active past the token lifetime):
1. Launch spark-beeline.
2. create table alt_s1 (time timestamp, name string, isright boolean, datetoday date, num binary, height double, score float, decimaler decimal(10,0), id tinyint, age int, license bigint, length smallint) row format delimited fields terminated by ',';
3. load data local inpath '/opt/typeddata60.txt' into table alt_s1;
4. show tables;   (tables listed successfully)
5. select * from alt_s1;   -- throws an HDFS_DELEGATION_TOKEN exception:

0: jdbc:hive2://10.18.18.214:23040/default> select * from alt_s1;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 22.0 failed 4 times, most recent failure: Lost task 1.3 in stage 22.0 (TID 106, blr1000023110, executor 1): org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 7 for spark) can't be found in cache
        at org.apache.hadoop.ipc.Client.call(Client.java:1475)
        at org.apache.hadoop.ipc.Client.call(Client.java:1412)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255)
        at sun.reflect.GeneratedMethodAccessor44.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy16.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1226)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:306)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:272)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:264)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1526)
        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304)
        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:312)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
        at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109)
        at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
        at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:257)
        at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:256)
        at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:214)
        at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:94)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)

**Note: even after kinit, the Spark/Hadoop delegation token is not renewed.**

By contrast, launching a fresh spark-sql session and running the same query succeeds:
1. Launch spark-sql.
2. spark-sql> select * from alt_s1;

2018-06-22 14:24:04 INFO HiveMetaStore:746 - 0: get_table : db=test_one tbl=alt_s1
2018-06-22 14:24:04 INFO audit:371 - ugi=spark/had...@hadoop.com ip=unknown-ip-addr cmd=get_table : db=test_one tbl=alt_s1
2018-06-22 14:24:04 INFO SQLStdHiveAccessController:95 - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=2cf6aac4-91c6-4c2d-871b-4d7620d91f43, clientType=HIVECLI]
2018-06-22 14:24:04 INFO metastore:291 - Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
2018-06-22 14:24:04 INFO HiveMetaStore:746 - 0: Shutting down the object store...
2018-06-22 14:24:04 INFO audit:371 - ugi=spark/had...@hadoop.com ip=unknown-ip-addr cmd=Shutting down the object store...
2018-06-22 14:24:04 INFO HiveMetaStore:746 - 0: Metastore shutdown complete.
2018-06-22 14:24:04 INFO audit:371 - ugi=spark/had...@hadoop.com ip=unknown-ip-addr cmd=Metastore shutdown complete.
2018-06-22 14:24:04 INFO MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 437.9 KB, free 365.9 MB)
2018-06-22 14:24:04 INFO MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 33.6 KB, free 365.8 MB)
2018-06-22 14:24:04 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on blr1000023111:45858 (size: 33.6 KB, free: 366.3 MB)
2018-06-22 14:24:04 INFO SparkContext:54 - Created broadcast 0 from
2018-06-22 14:24:05 INFO DFSClient:1033 - Created HDFS_DELEGATION_TOKEN token 42 for spark on ha-hdfs:hacluster
2018-06-22 14:24:05 INFO TokenCache:125 - Got dt for hdfs://hacluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 42 for spark)
2018-06-22 14:24:05 INFO FileInputFormat:249 - Total input paths to process : 1
2018-06-22 14:24:05 INFO SparkContext:54 - Starting job: processCmd at CliDriver.java:376
2018-06-22 14:24:05 INFO DAGScheduler:54 - Got job 0 (processCmd at CliDriver.java:376) with 1 output partitions
2018-06-22 14:24:05 INFO DAGScheduler:54 - Final stage: ResultStage 0 (processCmd at CliDriver.java:376)
2018-06-22 14:24:05 INFO DAGScheduler:54 - Parents of final stage: List()
2018-06-22 14:24:05 INFO DAGScheduler:54 - Missing parents: List()
2018-06-22 14:24:05 INFO DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[4] at processCmd at CliDriver.java:376), which has no missing parents
2018-06-22 14:24:05 INFO MemoryStore:54 - Block broadcast_1 stored as values in memory (estimated size 10.6 KB, free 365.8 MB)
2018-06-22 14:24:05 INFO MemoryStore:54 - Block broadcast_1_piece0 stored as bytes in memory (estimated size 5.3 KB, free 365.8 MB)
2018-06-22 14:24:05 INFO BlockManagerInfo:54 - Added broadcast_1_piece0 in memory on blr1000023111:45858 (size: 5.3 KB, free: 366.3 MB)
2018-06-22 14:24:05 INFO SparkContext:54 - Created broadcast 1 from broadcast at DAGScheduler.scala:1039
2018-06-22 14:24:05 INFO ContextCleaner:54 - Cleaned accumulator 0
2018-06-22 14:24:05 INFO DAGScheduler:54 - Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[4] at processCmd at CliDriver.java:376) (first 15 tasks are for partitions Vector(0))
2018-06-22 14:24:05 INFO TaskSchedulerImpl:54 - Adding task set 0.0 with 1 tasks
2018-06-22 14:24:05 INFO TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, ANY, 7923 bytes)
2018-06-22 14:24:05 INFO Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-06-22 14:24:05 INFO HadoopRDD:54 - Input split: hdfs://hacluster/user/sparkhive/warehouse/test_one.db/alt_s1/typeddata60.txt:0+4802
2018-06-22 14:24:05 INFO CodeGenerator:54 - Code generated in 232.938395 ms
2018-06-22 14:24:06 INFO Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 2221 bytes result sent to driver
2018-06-22 14:24:06 INFO TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 660 ms on localhost (executor driver) (1/1)
2018-06-22 14:24:06 INFO TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2018-06-22 14:24:06 INFO DAGScheduler:54 - ResultStage 0 (processCmd at CliDriver.java:376) finished in 0.769 s
2018-06-22 14:24:06 INFO DAGScheduler:54 - Job 0 finished: processCmd at CliDriver.java:376, took 0.829488 s

2016-10-12 12:00:00  abcd  true   2015-10-10  5  1600.0  26.0  123456789  9  999  999999  1
2015-11-13 12:10:22  jaie  true   2016-11-11  7  1800.0  29.0  455756789  8  888  888888  2
2014-09-11 22:10:22  suss  false  2013-01-21  8  2200.0  19.0  232344559  7  777  777777  3
2016-10-12 12:00:00  abcd  true   2015-10-10  5  1600.0  26.0  123456789  9  999  999999  4
2015-11-13 12:10:22  jaie  false  2016-11-11  7  1800.0  29.0  455756789  8  888  888888  5
2014-09-11 22:10:22  suss  false  2003-01-21  8  2200.0  19.0  24444559   1  111  11111   6
2014-09-11 22:10:22  suss  false  2013-01-21  8  2200.0  19.0  232344559  7  777  777777  7
2016-10-12 12:00:00  abcd  true   2015-10-10  5  1600.0  26.0  123456789  9  999  999999  8
2015-11-13 12:10:22  jaie  false  2016-11-11  7  1800.0  29.0  455756789  8  888  888888  9
2014-09-11 22:10:22  suss  false  2003-01-21  8  2200.0  19.0  24444559   1  111  11111   10
(rows 11-60 repeat this same 10-row pattern, with the final column counting up to 60; all 60 rows were returned successfully)

Below is the delegation-token configuration in hdfs-site.xml:

<property>
  <name>dfs.namenode.delegation.key.update-interval</name>
  <value>18000000</value>
</property>
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>60000000</value>
</property>

An insert in the same beeline session also fails with a job-aborted exception:
0: jdbc:hive2://10.18.18.214:23040/default> show tables;
+-----------+------------+--------------+--+
| database  | tableName  | isTemporary  |
+-----------+------------+--------------+--+
| test_one  | alt_s1     | false        |
| test_one  | csvtable   | false        |
| test_one  | csvtable1  | false        |
| test_one  | csvtable2  | false        |
| test_one  | emp        | false        |
| test_one  | tab1       | false        |
| test_one  | tab2       | false        |
+-----------+------------+--------------+--+
7 rows selected (0.049 seconds)
0: jdbc:hive2://10.18.18.214:23040/default> insert into emp values( 34 );
Error: org.apache.spark.SparkException: Job aborted. (state=,code=0)
0: jdbc:hive2://10.18.18.214:23040/default>

JDBC server logs snapshot:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 7 for spark) can't be found in cache
        at org.apache.hadoop.ipc.Client.call(Client.java:1475)
        at org.apache.hadoop.ipc.Client.call(Client.java:1412)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy15.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy16.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1648)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1689)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1624)
        at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
        at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:804)
        at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80)
        at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:261)
        at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:246)
        ... 16 more
2018-06-22 14:53:22,557 | ERROR | [HiveServer2-Background-Pool: Thread-2966] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179)
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.SparkException: Job aborted.
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:269)
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
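Editorial note, not part of the original report: the two delegation-token values quoted from hdfs-site.xml are in milliseconds (the usual unit for these HDFS settings), which makes the reproduction window easier to see once converted to hours. A minimal sketch of that arithmetic:

```python
# Convert the report's hdfs-site.xml delegation-token settings
# (assumed to be milliseconds, as is standard for these keys) to hours.
MS_PER_HOUR = 60 * 60 * 1000

key_update_interval_ms = 18_000_000  # dfs.namenode.delegation.key.update-interval
token_max_lifetime_ms = 60_000_000   # dfs.namenode.delegation.token.max-lifetime

key_update_hours = key_update_interval_ms / MS_PER_HOUR
max_lifetime_hours = token_max_lifetime_ms / MS_PER_HOUR

print(f"key update interval: {key_update_hours:.1f} h")   # prints 5.0 h
print(f"token max lifetime:  {max_lifetime_hours:.1f} h") # prints 16.7 h
```

So the beeline session was holding HDFS_DELEGATION_TOKEN token 7 past its roughly 16.7-hour maximum lifetime, after which the NameNode evicts it from its cache ("can't be found in cache"), while the freshly launched spark-sql session obtained a new token (token 42) and succeeded.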