[ https://issues.apache.org/jira/browse/FLINK-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Zimmer updated FLINK-4742:
-------------------------------
Description:
We have a test that normally passes on our CI server, but it failed with an NPE
on the line in {{WindowOperator#trigger()}} that removes an item from the
processing time queue (the stack trace lined up with
{{processingTimeTimersQueue.remove();}}). This looks like a timing issue: can
close or dispose be called from a shutdown thread while trigger is executing?
Unless we had a source/artifact mismatch, {{processingTimeTimersQueue.peek()}}
was called successfully a few lines earlier, so it appears
{{processingTimeTimersQueue}} was nulled out afterwards.
I could not reproduce this locally, so I couldn't validate that for certain.
{code}
Stacktrace
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:830)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:773)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:773)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: TimerException{java.lang.NullPointerException}
    at org.apache.flink.streaming.runtime.tasks.DefaultTimeServiceProvider$TriggerTask.run(DefaultTimeServiceProvider.java:101)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.trigger(WindowOperator.java:477)
    at org.apache.flink.streaming.runtime.tasks.DefaultTimeServiceProvider$TriggerTask.run(DefaultTimeServiceProvider.java:99)
    ... 7 more
{code}
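To illustrate the suspected race, here is a minimal sketch of one way a timer queue could be protected against being nulled out while a trigger is in flight. It is not Flink's actual code and not a proposed patch: the class {{GuardedTimerQueue}}, its {{lock}} field, and its methods are hypothetical names invented for this example. It assumes the timer thread and the shutdown thread can agree on a shared lock, and that a trigger arriving after close may simply do nothing.

```java
import java.util.PriorityQueue;

/**
 * Hypothetical stand-in for an operator that owns a processing-time timer
 * queue. All names here are assumptions made for illustration only.
 */
class GuardedTimerQueue {
    // Single lock shared by the timer thread (trigger) and the shutdown thread (close).
    private final Object lock = new Object();

    // Set to null on close, mirroring the behaviour suspected in the report.
    private PriorityQueue<Long> timers = new PriorityQueue<>();

    /** Registers a timer; silently ignored once the queue has been closed. */
    void register(long timestamp) {
        synchronized (lock) {
            if (timers != null) {
                timers.add(timestamp);
            }
        }
    }

    /**
     * Removes every timer with timestamp <= time and returns how many were removed.
     * Returns 0 if close() has already run, instead of throwing an NPE.
     */
    int trigger(long time) {
        synchronized (lock) {
            if (timers == null) {
                return 0; // shutdown won the race; nothing left to fire
            }
            int fired = 0;
            // peek() and remove() run under the same lock, so close() cannot
            // null the queue between the two calls.
            while (!timers.isEmpty() && timers.peek() <= time) {
                timers.remove();
                fired++;
            }
            return fired;
        }
    }

    /** Releases the queue; safe to call while another thread is inside trigger(). */
    void close() {
        synchronized (lock) {
            timers = null;
        }
    }
}
```

With this arrangement a {{close()}} call from a shutdown thread blocks until a running {{trigger()}} has finished, and any later {{trigger()}} returns without touching the queue. Whether a shared lock or cancelling the timer service before nulling the queue is the better choice depends on how shutdown is ordered in the real operator.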
> NPE in WindowOperator.trigger() on shutdown
> -------------------------------------------
>
> Key: FLINK-4742
> URL: https://issues.apache.org/jira/browse/FLINK-4742
> Project: Flink
> Issue Type: Bug
> Components: Windowing Operators
> Affects Versions: 1.2.0
> Reporter: Matt Zimmer
> Priority: Minor