Abacn commented on issue #21696: URL: https://github.com/apache/beam/issues/21696#issuecomment-1287472367
The original failure was due to the Flink 1.12 job server artifact having long stopped being updated (https://pantheon.corp.google.com/gcr/images/apache-beam-testing/global/beam_portability/beam_flink1.12_job_server). After switching to Flink 1.13, all jobs now fail with the same error:

```
04:43:52 2022/10/21 08:43:52 Submitted job: load0tests0go0flink0batch0combine0101021065324-root-1021084351-5f971342_60528140-7d74-4569-a5ec-99335a2f7dfe
04:43:52 2022/10/21 08:43:52 Job state: STOPPED
04:43:52 2022/10/21 08:43:52 Job state: STARTING
04:43:52 2022/10/21 08:43:52 Job state: RUNNING
04:45:01 2022/10/21 08:45:01  (): java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error while waiting for job to be initialized
04:45:01 	at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:316)
04:45:01 	at org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061)
04:45:01 	at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958)
04:45:01 	at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:195)
04:45:01 	at org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:132)
04:45:01 	at org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:99)
04:45:01 	at org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:86)
04:45:01 	at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
04:45:01 	at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
04:45:01 	at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
04:45:01 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
04:45:01 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
04:45:01 	at java.lang.Thread.run(Thread.java:750)
04:45:01 Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error while waiting for job to be initialized
04:45:01 	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
04:45:01 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
04:45:01 	at org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056)
04:45:01 	... 11 more
04:45:01 Caused by: java.lang.RuntimeException: Error while waiting for job to be initialized
04:45:01 	at org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160)
04:45:01 	at org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82)
04:45:01 	at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73)
04:45:01 	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
04:45:01 	at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
04:45:01 	at java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:457)
04:45:01 	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
04:45:01 	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
04:45:01 	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
04:45:01 	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
04:45:01 Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
04:45:01 	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
04:45:01 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
04:45:01 	at org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$null$0(AbstractSessionClusterExecutor.java:83)
04:45:01 	at org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:140)
04:45:01 	... 9 more
04:45:01 Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
04:45:01 	at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:386)
04:45:01 	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
04:45:01 	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
04:45:01 	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
04:45:01 	at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:575)
04:45:01 	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:943)
04:45:01 	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
04:45:01 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
04:45:01 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
04:45:01 	at java.lang.Thread.run(Thread.java:750)
04:45:01 Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.util.RestClientException: Response was neither of the expected type([simple type, class org.apache.flink.runtime.rest.messages.job.JobDetailsInfo]) nor an error.
04:45:01 	at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
04:45:01 	at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
04:45:01 	at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925)
04:45:01 	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:967)
04:45:01 	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940)
04:45:01 	... 4 more
04:45:01 Caused by: org.apache.flink.runtime.rest.util.RestClientException: Response was neither of the expected type([simple type, class org.apache.flink.runtime.rest.messages.job.JobDetailsInfo]) nor an error.
04:45:01 	at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:502)
04:45:01 	at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:466)
04:45:01 	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966)
04:45:01 	... 5 more
04:45:01 Caused by: org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot map `null` into type `long` (set DeserializationConfig.DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES to 'false' to allow)
04:45:01  at [Source: UNKNOWN; line: -1, column: -1] (through reference chain: org.apache.flink.runtime.rest.messages.job.JobDetailsInfo["maxParallelism"])
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:63)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.DeserializationContext.reportInputMismatch(DeserializationContext.java:1575)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.NumberDeserializers$PrimitiveOrWrapperDeserializer.getNullValue(NumberDeserializers.java:176)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.impl.PropertyValueBuffer._findMissing(PropertyValueBuffer.java:204)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.impl.PropertyValueBuffer.getParameters(PropertyValueBuffer.java:160)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:288)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:202)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:520)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1390)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:362)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:195)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:4569)
04:45:01 	at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2867)
04:45:01 	at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:475)
04:45:01 	... 7 more
04:45:01 2022/10/21 08:45:01  (): org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot map `null` into type `long` (set DeserializationConfig.DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES to 'false' to allow)
04:45:01  at [Source: UNKNOWN; line: -1, column: -1] (through reference chain: org.apache.flink.runtime.rest.messages.job.JobDetailsInfo["maxParallelism"])
04:45:01 2022/10/21 08:45:01 Job state: FAILED
04:45:02 2022/10/21 08:45:01 Failed to execute job: job load0tests0go0flink0batch0combine0101021065324-root-1021084351-5f971342_60528140-7d74-4569-a5ec-99335a2f7dfe failed
```
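The innermost cause is Jackson refusing to map a missing/`null` `maxParallelism` into a primitive `long` while deserializing `JobDetailsInfo`, with `FAIL_ON_NULL_FOR_PRIMITIVES` enabled (as the error message itself indicates for Flink's shaded REST mapper). Below is a minimal, self-contained sketch of that failure mode using plain Jackson; `JobDetails` is a hypothetical stand-in for Flink's `JobDetailsInfo`, not the real class, and the input JSON is illustrative only.

```java
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class MaxParallelismRepro {

    /** Hypothetical stand-in for the creator-bound primitive field in JobDetailsInfo. */
    static class JobDetails {
        final long maxParallelism;

        @JsonCreator
        JobDetails(@JsonProperty("maxParallelism") long maxParallelism) {
            this.maxParallelism = maxParallelism;
        }
    }

    public static void main(String[] args) throws Exception {
        // The error message suggests Flink's shaded REST mapper runs with this feature enabled.
        ObjectMapper mapper = new ObjectMapper()
                .enable(DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES);

        // A response that does not contain "maxParallelism" leaves the creator argument null,
        // so the mapper throws:
        //   MismatchedInputException: Cannot map `null` into type `long`
        //   (through reference chain: JobDetails["maxParallelism"])
        mapper.readValue("{}", JobDetails.class);
    }
}
```

Disabling the feature would presumably only mask the problem (the missing field would silently become 0); the point of the sketch is that the REST response apparently does not carry the `maxParallelism` field the client-side `JobDetailsInfo` expects, which would be consistent with a version mismatch between the REST client in the job server and the Flink version serving the endpoint.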
