alirezazamani commented on a change in pull request #1346:
URL: https://github.com/apache/helix/pull/1346#discussion_r485126373
##########
File path:
helix-core/src/test/java/org/apache/helix/integration/task/TestEnqueueJobs.java
##########
@@ -135,10 +146,18 @@ public void testQueueParallelJobs() throws
InterruptedException {
.setCommand(MockTask.TASK_COMMAND).setMaxAttemptsPerTask(2)
.setJobCommandConfigMap(Collections.singletonMap(MockTask.JOB_DELAY, "10000"));
+ _driver.waitToStop(queueName, 5000L);
+
// Add 4 jobs to the queue
+ List<String> jobNames = new ArrayList<>();
+ List<JobConfig.Builder> jobBuilders = new ArrayList<>();
for (int i = 0; i < numberOfJobsAddedBeforeControllerSwitch; i++) {
- _driver.enqueueJob(queueName, "JOB" + i, jobBuilder);
+ jobNames.add("JOB" + i);
+ jobBuilders.add(jobBuilder);
}
+ _driver.enqueueJobs(queueName, jobNames, jobBuilders);
Review comment:
Yeah, batching would help. The issue seems to be inherent to how Helix/ZK
behaves when a user adds jobs back to back, so if that is the behavior, batch
job addition is the better approach. Say the user is adding job1, job2 and
job3 one by one:
At T1: job1 and job2 are added and the jobDAG is changed.
At T2: We get the children of config and learn the new configs of job1 and
job2 and the change in the DAG.
At T3: job3 is added to ZK and to the DAG.
At T4: Refresh starts, and the controller sees the configs of job1 and job2,
while the DAG contains job1, job2 and job3.
Now in the pipeline, since we see job3 in the DAG but do not see its config,
we purge job3 and remove its config from ZK. Hence job3 will never finish.
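To make the timeline concrete, here is a minimal sketch of the race. It is
not Helix code: the class PurgeRaceSketch, the method jobsToPurge and the two
in-memory collections standing in for ZK are all hypothetical, and the purge
rule is simplified to "in the DAG snapshot but not in the config snapshot".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PurgeRaceSketch {
    // Hypothetical stand-in for the purge check: any job that appears in the
    // DAG snapshot but has no entry in the config snapshot is purged.
    static Set<String> jobsToPurge(List<String> dagSnapshot, Set<String> configSnapshot) {
        Set<String> missing = new LinkedHashSet<>(dagSnapshot);
        missing.removeAll(configSnapshot);
        return missing;
    }

    public static void main(String[] args) {
        // In-memory stand-ins for what is stored in ZK (assumption, not the real layout).
        List<String> dag = new ArrayList<>();
        Set<String> configs = new LinkedHashSet<>();

        // T1: job1 and job2 are added one by one.
        configs.add("job1"); dag.add("job1");
        configs.add("job2"); dag.add("job2");

        // T2: the controller reads the configs; job3 does not exist yet.
        Set<String> configSnapshot = new LinkedHashSet<>(configs);

        // T3: job3 is added between the two reads.
        configs.add("job3"); dag.add("job3");

        // T4: the controller reads the DAG, which already contains job3.
        List<String> dagSnapshot = new ArrayList<>(dag);

        // The snapshots are inconsistent, so job3 looks like a job without config.
        System.out.println(jobsToPurge(dagSnapshot, configSnapshot)); // prints [job3]
    }
}
```

If all three jobs are written before the controller's first read, as a batch
add would do, both snapshots contain the same jobs and the sketch purges
nothing.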
To answer your question: it depends on when you resume the queue and which
part of the logic we are in. If we resume the queue but the controller has
not yet received the notification for the new jobConfig, then it can skip
scheduling for the first pipeline. But the main issue is that if the
controller does not see the jobConfig (which is quite likely when we add jobs
one by one in a loop instead of in a batch), then purge can delete that job
(because the controller found the job in the DAG but missing from config), and
then the job never finishes.