[
https://issues.apache.org/jira/browse/KYLIN-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930198#comment-17930198
]
Guoliang Sun commented on KYLIN-6059:
-------------------------------------
h3. Root Cause
The failure occurs because `suicideJob` runs inside a JDBC transaction only
(`JdbcUtil.withTransaction`); no UnitOfWork transaction is open on the thread.
Deleting cached metadata therefore fails the `UnitOfWork.get` check with
"current thread is not accompanied by a UnitOfWork".
h3. Dev Design
Wrap the upper-level business logic in a UnitOfWork transaction so that, when
`NSparkCubingJob` or `NSparkMergeJob` is cancelled, segments whose status is
neither "ready" nor "warning" are deleted successfully.
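The following is a minimal, self-contained sketch of the pattern, not the actual patch. Every name in it (`UnitOfWorkSketch`, `doInUnitOfWork`, `Store`, `cancelJob`, the status strings) is hypothetical and only imitates the behaviour visible in the stack trace: a delete that checks for a thread-bound unit of work, and a wrapper that opens one around the cancellation logic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical stand-in for the pattern described in this comment;
// none of these names are Kylin's real classes or signatures.
class UnitOfWorkSketch {

    // Per-thread marker: non-null only while a unit of work is open.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    // Imitates the failing precondition: metadata mutations require
    // a unit of work on the calling thread.
    static String currentUnit() {
        return Objects.requireNonNull(CURRENT.get(),
                "current thread is not accompanied by a UnitOfWork");
    }

    // Opens a unit of work around the given logic and always closes it.
    static <T> T doInUnitOfWork(String unitName, Supplier<T> body) {
        CURRENT.set(unitName);
        try {
            return body.get();
        } finally {
            CURRENT.remove();
        }
    }

    // In-memory store whose delete path checks for a unit of work first.
    static class Store {
        private final Map<String, String> segmentStatus = new HashMap<>();

        void put(String segmentId, String status) {
            segmentStatus.put(segmentId, status);
        }

        boolean contains(String segmentId) {
            return segmentStatus.containsKey(segmentId);
        }

        // Removes segments that are neither READY nor WARNING and
        // returns how many were removed.
        int deleteUnfinishedSegments() {
            currentUnit(); // throws NullPointerException outside a unit of work
            int before = segmentStatus.size();
            segmentStatus.values().removeIf(
                    s -> !s.equals("READY") && !s.equals("WARNING"));
            return before - segmentStatus.size();
        }
    }

    // Cancellation wrapped in a unit of work, as the design proposes.
    static int cancelJob(Store store, String project) {
        return doInUnitOfWork(project, store::deleteUnfinishedSegments);
    }
}
```

Calling `deleteUnfinishedSegments()` directly reproduces the precondition failure in this sketch, while `cancelJob(store, project)` succeeds because the wrapper binds a unit of work to the thread for the duration of the delete.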
> When the model is in a "broken" state, the build task status becomes abnormal
> -----------------------------------------------------------------------------
>
> Key: KYLIN-6059
> URL: https://issues.apache.org/jira/browse/KYLIN-6059
> Project: Kylin
> Issue Type: Bug
> Affects Versions: 5.0.0
> Reporter: Guoliang Sun
> Assignee: Guoliang Sun
> Priority: Major
> Fix For: 5.0.2
>
>
> During the execution of the build task, the fact table of the model was
> deleted, causing the model to enter a "broken" state. The build task should
> have transitioned to the "discard" state via `suicideJob`, but an exception
> occurred instead. The `kylin.log` is as follows:
> {code:java}
> 2024-12-12T14:26:42,651 WARN [JobCheckThreadPool] runners.JobCheckUtil :
> [UNEXPECTED_THINGS_HAPPENED] job
> e6a33368-6f6f-eac4-8295-62d04d30e443-ce504fd8-30e7-67b9-9670-dc67d59e8ecd
> should be suicidal but discard failed
> org.apache.kylin.common.persistence.metadata.PersistException: persist
> messages failed
> at
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:149)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:122)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTxAndRetry(JdbcUtil.java:84)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTxAndRetry(JdbcUtil.java:64)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.util.JobContextUtil.withTxAndRetry(JobContextUtil.java:292)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.util.JobContextUtil.withTxAndRetry(JobContextUtil.java:287)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.execution.ExecutableManager.suicideJob(ExecutableManager.java:1706)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.runners.JobCheckUtil.markSuicideJob(JobCheckUtil.java:99)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.runners.JobCheckUtil.markSuicideJob(JobCheckUtil.java:91)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.runners.JobCheckRunner.markSuicideForErrorOrPausedJobs(JobCheckRunner.java:134)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.runners.JobCheckRunner.run(JobCheckRunner.java:116)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[?:1.8.0_181]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> ~[?:1.8.0_181]
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> ~[?:1.8.0_181]
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> ~[?:1.8.0_181]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> ~[?:1.8.0_181]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> ~[?:1.8.0_181]
> at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
> Caused by: java.lang.NullPointerException: current thread is not accompanied
> by a UnitOfWork
> at
> org.apache.kylin.guava30.shaded.common.base.Preconditions.checkNotNull(Preconditions.java:897)
> ~[kylin-external-guava30-5.0.0.jar:?]
> at
> org.apache.kylin.common.persistence.transaction.UnitOfWork.get(UnitOfWork.java:227)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.common.persistence.transaction.UnitOfWork.isReadonly(UnitOfWork.java:369)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.common.persistence.InMemResourceStore.checkEnv(InMemResourceStore.java:231)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.common.persistence.InMemResourceStore.deleteResourceImpl(InMemResourceStore.java:178)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.common.persistence.ResourceStore.deleteResource(ResourceStore.java:346)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.metadata.cachesync.CachedCrudAssist.delete(CachedCrudAssist.java:306)
> ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.metadata.cachesync.CachedCrudAssist.delete(CachedCrudAssist.java:297)
> ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
> at org.apache.kylin.metadata.Manager.delete(Manager.java:195)
> ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.metadata.cube.model.NDataflowManager.lambda$updateDataflowWithoutIndex$21(NDataflowManager.java:686)
> ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflow(NDataflowManager.java:600)
> ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflowWithoutIndex(NDataflowManager.java:661)
> ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflow(NDataflowManager.java:648)
> ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.engine.spark.job.NSparkCubingJob.cancelJob(NSparkCubingJob.java:274)
> ~[kylin-engine-spark-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.execution.ExecutableManager.suicideJob(ExecutableManager.java:1741)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.job.execution.ExecutableManager.lambda$suicideJob$88(ExecutableManager.java:1708)
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
> at
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:133)
> ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
> ... 17 more {code}