justinmclean opened a new issue, #10271:
URL: https://github.com/apache/gravitino/issues/10271
### What would you like to be improved?
JobManager.runJob submits the job to jobExecutor before persisting JobEntity
in entityStore. If entityStore.put(...) throws IOException, the method throws,
but the submitted job remains queued/running. Because lifecycle reconciliation
and cleanup operate on persisted JobEntity records, this leaves an orphaned
execution that cannot be tracked or managed through normal APIs.
### How should we improve?
Add compensation logic in runJob: if submitJob succeeds but entityStore.put
fails, attempt jobExecutor.cancelJob(jobExecutionId) before rethrowing.
Preserve original failure semantics (throw the persistence failure) while
logging rollback success/failure. Optionally attempt staging-directory cleanup
in this failure path as best effort. This makes submit+persist effectively
atomic from the caller’s perspective and prevents untracked running jobs.
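The compensation flow described above can be sketched as follows. This is a self-contained illustration of the submit-then-persist rollback pattern, not Gravitino's actual code: `JobExecutor` and `EntityStore` here are minimal stand-ins, and all names are hypothetical.

```java
import java.io.IOException;

// Sketch of compensation logic for runJob: if persisting the JobEntity fails
// after the job was submitted, cancel the submitted job (best effort) and
// rethrow the original persistence failure.
public class RunJobCompensation {
  interface JobExecutor {
    String submitJob(String template);

    void cancelJob(String executionId);
  }

  interface EntityStore {
    void put(String jobEntity) throws IOException;
  }

  static String runJob(JobExecutor executor, EntityStore store, String template)
      throws IOException {
    String jobExecutionId = executor.submitJob(template);
    try {
      store.put("job-entity-for-" + jobExecutionId);
    } catch (IOException persistFailure) {
      // Best-effort rollback: cancel the already-submitted job so it does not
      // keep running untracked, then rethrow the original failure. A cancel
      // failure is recorded as a suppressed exception rather than masking
      // the persistence error.
      try {
        executor.cancelJob(jobExecutionId);
      } catch (RuntimeException cancelFailure) {
        persistFailure.addSuppressed(cancelFailure);
      }
      throw persistFailure;
    }
    return jobExecutionId;
  }

  public static void main(String[] args) {
    JobExecutor executor =
        new JobExecutor() {
          @Override
          public String submitJob(String template) {
            System.out.println("submitted " + template);
            return "exec-1";
          }

          @Override
          public void cancelJob(String executionId) {
            System.out.println("cancelled " + executionId);
          }
        };
    // Entity store that always fails, to exercise the rollback path.
    EntityStore failingStore =
        job -> {
          throw new IOException("Entity store error");
        };
    try {
      runJob(executor, failingStore, "shell_job");
    } catch (IOException e) {
      System.out.println("rethrown: " + e.getMessage());
    }
  }
}
```

With this shape, the caller either observes a persisted, trackable job or the original persistence exception with no job left running.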
Here is a test to help verify the fix:
```java
@Test
public void testRunJobShouldCancelSubmittedJobWhenStorePutFails() throws IOException {
  mockedMetalake
      .when(() -> MetalakeManager.checkMetalake(metalakeIdent, entityStore))
      .thenAnswer(a -> null);

  JobTemplateEntity shellJobTemplate =
      newShellJobTemplateEntity("shell_job", "A shell job template");
  when(jobManager.getJobTemplate(metalake, shellJobTemplate.name()))
      .thenReturn(shellJobTemplate);

  String jobExecutionId = "job_execution_id_for_test";
  when(jobExecutor.submitJob(any())).thenReturn(jobExecutionId);

  doThrow(new IOException("Entity store error"))
      .when(entityStore)
      .put(any(JobEntity.class), anyBoolean());

  Assertions.assertThrows(
      RuntimeException.class,
      () -> jobManager.runJob(metalake, "shell_job", Collections.emptyMap()));

  verify(jobExecutor, times(1)).cancelJob(jobExecutionId);
}
```