Hi,

The earlier problem is still not resolved, but the symptom is now a bit clearer.
Version: flink-1.15.1
Scenario: writing data to Hive; committing partitions after the write throws an exception.

Error log:

Caused by: java.io.FileNotFoundException: /tmp/jm_253c182f914fb67750844d2e71864a5a/blobStorage/job_615800b00c211de674f17e46938daeb7/blob_p-a813f094892f1c71b7884d0aec7972edbeae08e3-65d1205985504738577e6a7d90385f17 (No such file or directory)
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:228)
    at java.util.zip.ZipFile.<init>(ZipFile.java:157)
    at java.util.jar.JarFile.<init>(JarFile.java:171)
    at java.util.jar.JarFile.<init>(JarFile.java:108)
    at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
    at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:99)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
    at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:152)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2943)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3034)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995)
    ... 54 more

This appears to be an attempt to read hive-site: the JobManager looks up hive-site under the local path /tmp/jm_253c182f914fb67750844d2e71864a5a/blobStorage/job_615800b00c211de674f17e46938daeb7/blob_p-a813f094892f1c71b7884d0aec7972edbeae08e3-65d1205985504738577e6a7d90385f17, but that path does not exist, which causes the exception.

The path did exist at one point: the job id in the path, 615800b00c211de674f17e46938daeb7, belongs to a job that ran earlier. Once that job finished, the directory job_615800b00c211de674f17e46938daeb7 was removed, yet every newly submitted job still looks for the Hive configuration under that old path, which triggers the exception.

If the cluster is restarted, submitting the same job does not fail, so it looks like a probabilistic event. What could be causing this?

Thanks

On 2022-07-21 14:52:51, "RS" <tinyshr...@163.com> wrote:
>Hi,
>
>Environment:
>flink-1.15.1 session cluster on K8S
>hive3
>A Flink job writing to Hive, configured to commit partitions periodically.
>
>Symptoms:
>1. Checkpoints run every 30s.
>2. Data files are produced on HDFS.
>3. No partition information appears in Hive.
>4. The job fails; after an automatic restart, the next checkpoint fails the same way.
>5. Some of the Hive-writing jobs keep running normally, while others hit this exception.
>6. Stopping the job and recreating it restores normal operation.
>
>Exception log:
>java.lang.RuntimeException: java.io.FileNotFoundException:
>/tmp/tm_10.244.25.164:6122-5b9301/blobStorage/job_ac640bf7276279f0452642918561670e/blob_p-d9755afa943119c325c059ddc70a45c904d9e4bd-d3a2ca82ca3a2429095708d4908ae184 (No such file or directory)
>    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3021)
>    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2973)
>    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848)
>    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1460)
>    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5001)
>    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5074)
>    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5161)
>    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5114)
>    at org.apache.flink.connectors.hive.util.HiveConfUtils.create(HiveConfUtils.java:38)
>    at org.apache.flink.connectors.hive.HiveTableMetaStoreFactory$HiveTableMetaStore.<init>(HiveTableMetaStoreFactory.java:72)
>    at org.apache.flink.connectors.hive.HiveTableMetaStoreFactory$HiveTableMetaStore.<init>(HiveTableMetaStoreFactory.java:64)
>    at org.apache.flink.connectors.hive.HiveTableMetaStoreFactory.createTableMetaStore(HiveTableMetaStoreFactory.java:61)
>    at org.apache.flink.connectors.hive.HiveTableMetaStoreFactory.createTableMetaStore(HiveTableMetaStoreFactory.java:43)
>    at org.apache.flink.connector.file.table.stream.PartitionCommitter.commitPartitions(PartitionCommitter.java:159)
>    at org.apache.flink.connector.file.table.stream.PartitionCommitter.processElement(PartitionCommitter.java:145)
>    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
>    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
>    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
>    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:519)
>    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:203)
>    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804)
>    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753)
>    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
>    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
>    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
>    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
>    at java.lang.Thread.run(Thread.java:750)
>Caused by: java.io.FileNotFoundException:
>/tmp/tm_10.244.25.164:6122-5b9301/blobStorage/job_ac640bf7276279f0452642918561670e/blob_p-d9755afa943119c325c059ddc70a45c904d9e4bd-d3a2ca82ca3a2429095708d4908ae184 (No such file or directory)
>    at java.util.zip.ZipFile.open(Native Method)
>    at java.util.zip.ZipFile.<init>(ZipFile.java:228)
>    at java.util.zip.ZipFile.<init>(ZipFile.java:157)
>    at java.util.jar.JarFile.<init>(JarFile.java:171)
>    at java.util.jar.JarFile.<init>(JarFile.java:108)
>    at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
>    at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
>    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:99)
>    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
>    at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:152)
>    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2943)
>    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3034)
>    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995)
>    ... 27 more
>
>
>
>Thanks
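P.S. The traces show Hadoop's Configuration reading hive-site.xml through a jar: URL that points into the local BLOB cache. The following minimal sketch (pure JDK, no Flink or Hadoop; the jar name and entry name are made up for illustration) reproduces the same FileNotFoundException from ZipFile.open once the underlying jar file is deleted. It only demonstrates the JDK-level failure mode; why newly submitted jobs still resolve a path under the old job id (presumably some cached Configuration or classloader holding stale jar: URLs) is the open question in this thread.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;

public class StaleJarUrlDemo {

    // Build a throwaway jar containing a fake hive-site.xml, read it once via a
    // jar: URL, delete the jar (as BLOB cache cleanup does for a finished job),
    // then try to open the same URL again. Returns true if the second attempt
    // fails with FileNotFoundException, matching the stack traces above.
    public static boolean deletedJarThrows() throws Exception {
        File jar = File.createTempFile("blob_p-demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new ZipEntry("hive-site.xml"));
            out.write("<configuration/>".getBytes("UTF-8"));
            out.closeEntry();
        }

        URL entry = new URL("jar:" + jar.toURI() + "!/hive-site.xml");

        // First read succeeds while the jar still exists. Caching is disabled
        // so no open JarFile handle masks the deletion below.
        URLConnection first = entry.openConnection();
        first.setUseCaches(false);
        first.getInputStream().close();

        // Remove the jar, simulating cleanup of the old job's blob directory.
        if (!jar.delete()) {
            throw new IllegalStateException("could not delete temp jar");
        }

        // Re-opening the same jar: URL now fails inside ZipFile.open, exactly
        // like the Configuration.loadResource traces in this thread.
        URLConnection second = entry.openConnection();
        second.setUseCaches(false);
        try {
            second.getInputStream().close();
            return false; // stale jar: URL unexpectedly still readable
        } catch (FileNotFoundException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(deletedJarThrows()
                ? "FileNotFoundException as expected"
                : "unexpected: stale jar URL still readable");
    }
}
```

Note that with the default useCaches=true, the JDK's JarFileFactory keeps the JarFile handle open and keyed by URL, so on Linux a cached handle can keep serving reads even after the file is unlinked; the failure only surfaces when a fresh ZipFile must be opened on the dead path, which may be why the error looks probabilistic.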