[ 
https://issues.apache.org/jira/browse/KYLIN-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930198#comment-17930198
 ] 

Guoliang Sun commented on KYLIN-6059:
-------------------------------------

h3. Root Cause

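The caller opened only a JDBC transaction and did not open a UnitOfWork 
transaction. Deleting cached metadata requires a UnitOfWork on the current 
thread, so the deletion failed with the `NullPointerException` shown in the 
log below ("current thread is not accompanied by a UnitOfWork").

h3. Dev Design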

Wrap the upper-level business logic in a UnitOfWork transaction so that, when 
an `NSparkCubingJob` or `NSparkMergeJob` is cancelled, segments whose status 
is neither "ready" nor "warning" are deleted successfully.
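
The failure mode and the fix can be illustrated with a small, self-contained 
toy model. It is only a sketch and does not use Kylin's classes: 
`UnitOfWorkSketch`, `CURRENT_UNIT`, `STORE`, `withJdbcTransaction`, 
`inUnitOfWork` and this `cancelJob` are hypothetical stand-ins, and placing 
the unit of work inside the JDBC transaction is an assumption, not necessarily 
what the actual patch does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

/** Toy model only: none of these members are Kylin APIs. */
public class UnitOfWorkSketch {

    /** Hypothetical marker for "this thread is inside a unit of work". */
    static final ThreadLocal<Object> CURRENT_UNIT = new ThreadLocal<>();

    /** Hypothetical in-memory metadata cache. */
    static final Map<String, String> STORE = new HashMap<>();

    /** Deletes a cached entry, but only when a unit of work is bound to the thread. */
    static void deleteResource(String path) {
        Objects.requireNonNull(CURRENT_UNIT.get(),
                "current thread is not accompanied by a UnitOfWork");
        STORE.remove(path);
    }

    /** Stand-in for the outer JDBC transaction: runs the body and wraps any failure. */
    static <T> T withJdbcTransaction(Supplier<T> body) {
        try {
            return body.get();
        } catch (RuntimeException e) {
            throw new IllegalStateException("persist messages failed", e);
        }
    }

    /** Stand-in for the missing wrapper: binds a unit of work for the duration of the body. */
    static <T> T inUnitOfWork(Supplier<T> body) {
        CURRENT_UNIT.set(new Object());
        try {
            return body.get();
        } finally {
            CURRENT_UNIT.remove();
        }
    }

    /** Stand-in for job cancellation, which removes an unfinished segment. */
    static boolean cancelJob(String segmentPath) {
        deleteResource(segmentPath);
        return true;
    }

    public static void main(String[] args) {
        STORE.put("/segments/seg-1", "NEW");

        // JDBC transaction only: the delete is rejected and surfaces as a wrapped failure.
        try {
            withJdbcTransaction(() -> cancelJob("/segments/seg-1"));
        } catch (IllegalStateException e) {
            System.out.println("without unit of work: " + e.getCause().getMessage());
        }

        // With the unit of work added around the business logic, the delete goes through.
        withJdbcTransaction(() -> inUnitOfWork(() -> cancelJob("/segments/seg-1")));
        System.out.println("segment still stored: " + STORE.containsKey("/segments/seg-1"));
    }
}
```

In this model the first call fails in the same shape as the log below: an 
outer "persist messages failed" error caused by the missing unit of work.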

> When the model is in a "broken" state, the build task status becomes abnormal
> -----------------------------------------------------------------------------
>
>                 Key: KYLIN-6059
>                 URL: https://issues.apache.org/jira/browse/KYLIN-6059
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>            Reporter: Guoliang Sun
>            Assignee: Guoliang Sun
>            Priority: Major
>             Fix For: 5.0.2
>
>
> During the execution of the build task, the fact table of the model was 
> deleted, causing the model to enter a "broken" state. The build task should 
> have transitioned to the "discard" state via `suicideJob`, but an exception 
> occurred instead. The `kylin.log` is as follows:
> {code:java}
> 2024-12-12T14:26:42,651 WARN  [JobCheckThreadPool] runners.JobCheckUtil : 
> [UNEXPECTED_THINGS_HAPPENED]  job 
> e6a33368-6f6f-eac4-8295-62d04d30e443-ce504fd8-30e7-67b9-9670-dc67d59e8ecd 
> should be suicidal but discard failed
> org.apache.kylin.common.persistence.metadata.PersistException: persist 
> messages failed
>         at 
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:149)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:122)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTxAndRetry(JdbcUtil.java:84)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTxAndRetry(JdbcUtil.java:64)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.util.JobContextUtil.withTxAndRetry(JobContextUtil.java:292)
>  ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.util.JobContextUtil.withTxAndRetry(JobContextUtil.java:287)
>  ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.execution.ExecutableManager.suicideJob(ExecutableManager.java:1706)
>  ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.runners.JobCheckUtil.markSuicideJob(JobCheckUtil.java:99)
>  ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.runners.JobCheckUtil.markSuicideJob(JobCheckUtil.java:91)
>  ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.runners.JobCheckRunner.markSuicideForErrorOrPausedJobs(JobCheckRunner.java:134)
>  ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.runners.JobCheckRunner.run(JobCheckRunner.java:116) 
> ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_181]
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_181]
>         at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
> Caused by: java.lang.NullPointerException: current thread is not accompanied 
> by a UnitOfWork
>         at 
> org.apache.kylin.guava30.shaded.common.base.Preconditions.checkNotNull(Preconditions.java:897)
>  ~[kylin-external-guava30-5.0.0.jar:?]
>         at 
> org.apache.kylin.common.persistence.transaction.UnitOfWork.get(UnitOfWork.java:227)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.common.persistence.transaction.UnitOfWork.isReadonly(UnitOfWork.java:369)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.common.persistence.InMemResourceStore.checkEnv(InMemResourceStore.java:231)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.common.persistence.InMemResourceStore.deleteResourceImpl(InMemResourceStore.java:178)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.common.persistence.ResourceStore.deleteResource(ResourceStore.java:346)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.metadata.cachesync.CachedCrudAssist.delete(CachedCrudAssist.java:306)
>  ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.metadata.cachesync.CachedCrudAssist.delete(CachedCrudAssist.java:297)
>  ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
>         at org.apache.kylin.metadata.Manager.delete(Manager.java:195) 
> ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.metadata.cube.model.NDataflowManager.lambda$updateDataflowWithoutIndex$21(NDataflowManager.java:686)
>  ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflow(NDataflowManager.java:600)
>  ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflowWithoutIndex(NDataflowManager.java:661)
>  ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflow(NDataflowManager.java:648)
>  ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.engine.spark.job.NSparkCubingJob.cancelJob(NSparkCubingJob.java:274)
>  ~[kylin-engine-spark-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.execution.ExecutableManager.suicideJob(ExecutableManager.java:1741)
>  ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.job.execution.ExecutableManager.lambda$suicideJob$88(ExecutableManager.java:1708)
>  ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:133)
>  ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
>         ... 17 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)