Dear All,

We have run into an issue related to the real-time OLAP feature. At some
point the streaming_job_status_checker-thread started to fail with a
NullPointerException while getting a streaming segment's job build state,
and it never recovers. From that point on, the streaming segments do not
get built and stay in the NEW state. After some time this also blocks the
batch builds (batch builds will not execute if there are more than 10 new
or pending build jobs).

When we restart the streaming coordinator process, it seems to recover
and the build jobs for these segments start to run.

This is the only related exception I can see in the logs (every 2 minutes):

21:58:59.677 [streaming_job_status_checker-thread-1] ERROR org.apache.kylin.stream.coordinator.Coordinator - error when check streaming segment job build state:SegmentJobBuildInfo{cubeName='speed_cube', segmentName='20190820200000_20190820210000', jobID='10954775-3cd6-325c-6a23-eed7d283daf4', retryCnt=0}
java.lang.NullPointerException
        at org.apache.kylin.stream.coordinator.Coordinator$StreamingBuildJobStatusChecker.doRun(Coordinator.java:1372) [kylin-stream-coordinator-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.apache.kylin.stream.coordinator.Coordinator$StreamingBuildJobStatusChecker.run(Coordinator.java:1351) [kylin-stream-coordinator-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]

Does anyone have any idea what the issue could be? (I see there's a cache
in the CubeManager which does not seem to contain that specific cube
instance, and StreamingBuildJobStatusChecker fails with an NPE because of
that.)
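
To make the suspected failure mode concrete, here is a minimal sketch of a
status-check pass that looks each cube up in a cache and skips (rather than
dereferences) entries the cache does not contain. Everything in it is
hypothetical and for illustration only: the class StatusCheckerSketch, the
cubeCache map and the checkJobs method are stand-ins and are not taken from
Kylin's Coordinator or CubeManager code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; NOT Kylin's actual implementation.
class StatusCheckerSketch {
    // Stand-in for a cube cache: cube name -> some cube descriptor.
    static final Map<String, String> cubeCache = new ConcurrentHashMap<>();

    /**
     * Runs one checker pass over the given cube names and returns the names
     * that had to be skipped because the cache had no entry for them.
     */
    static List<String> checkJobs(List<String> cubeNames) {
        List<String> skipped = new ArrayList<>();
        for (String cubeName : cubeNames) {
            String cube = cubeCache.get(cubeName);
            if (cube == null) {
                // Without this guard, using `cube` below would throw a
                // NullPointerException on every pass for this entry.
                skipped.add(cubeName);
                continue;
            }
            // A real checker would query the build job state here.
            cube.length();
        }
        return skipped;
    }

    public static void main(String[] args) {
        cubeCache.put("known_cube", "descriptor");
        // "speed_cube" is deliberately absent from the cache.
        System.out.println(checkJobs(Arrays.asList("known_cube", "speed_cube")));
    }
}
```

A guard like this would only hide the symptom; it would not explain why the
cube instance is missing from the cache in the first place.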

Thank you,
Andras
