imbajin commented on code in PR #2937:
URL: https://github.com/apache/hugegraph/pull/2937#discussion_r3323507986
##########
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/task/DistributedTaskScheduler.java:
##########
@@ -316,14 +355,18 @@ protected <V> HugeTask<V> deleteFromDB(Id id) {
@Override
public <V> HugeTask<V> delete(Id id, boolean force) {
- if (!force) {
- // Change status to DELETING, perform the deletion operation
through automatic
- // scheduling.
+ HugeTask<?> task = this.taskWithoutResult(id);
+
+ if (!force && !task.completed()) {
+ // Check task status: can't delete running tasks without force
this.updateStatus(id, null, TaskStatus.DELETING);
Review Comment:
Deleting a running task can race with the task's own cancellation callbacks
and resurrect the task record. In this branch the scheduler cancels the
in-memory task and immediately removes the task vertex from DB, but
`HugeTask.cancel()` invokes `TaskCallable.cancelled()`, and both
`UserJob.cancelled()` / `SysJob.cancelled()` call `save()`. Since `taskDone()`
is now a no-op, a long-running job deleted via `delete(id, false)` can be
removed here and then written back as `CANCELLED` during cancellation cleanup.
Please make deletion win over later task saves, or defer physical deletion
until the runner has fully stopped and can no longer persist the task.
##########
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/core/TaskCoreTest.java:
##########
@@ -76,20 +77,22 @@ public void testTask() throws TimeoutException {
Assert.assertEquals(id, task.id());
Assert.assertFalse(task.completed());
- Assert.assertThrows(IllegalArgumentException.class, () -> {
- scheduler.delete(id, false);
- }, e -> {
- Assert.assertContains("Can't delete incomplete task '88888'",
- e.getMessage());
- });
+ if (scheduler.getClass().equals(StandardTaskScheduler.class)) {
Review Comment:
This PR changes the high-risk `DistributedTaskScheduler` paths for
`cancel()`, `delete()`, `close()`, and restore behavior, but the updated tests
mostly skip distributed-specific assertions or only relax status expectations.
I could not find a test that actually exercises the new distributed
`delete(false)` / running-task cancellation paths, which is where the
DB-delete-versus-callback-save race above lives. Please add distributed
scheduler coverage for at least running task + `delete(false)`, running task +
`cancel()`, and close/cron shutdown behavior.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]