Re: Recurring Job Loses Schedule After Crash Due To Race Condition In Job Scheduler

Chandan Khandelwal Fri, 06 Mar 2026 04:07:25 -0800

I have created a JIRA ticket for this issue so it can be tracked and
discussed further:


https://issues.apache.org/jira/browse/OFBIZ-13370

Please feel free to add comments or suggestions there.



Kind Regards,
Chandan Khandelwal




On Wed, Feb 4, 2026 at 5:57 PM Chandan Khandelwal <
[email protected]> wrote:

> Hi Everyone,
>
> I came across an issue in the OFBiz job polling and scheduling process and
> would like to get input from the community.
>
> There appears to be a race condition where a *recurring job can lose its
> recurrence (tempExprId)* if the server crashes at a specific point during
> execution.
>
> When a job moves from SERVICE_QUEUED to SERVICE_RUNNING, a crash
> occurring *before the next recurrence is created* leaves the job in
> SERVICE_RUNNING state. On restart, JobManager.reloadCrashedJobs() assumes
> the next recurrence already exists and reschedules the job without
> tempExprId. Since the recurrence was never actually created, the chain
> breaks and the job does not run again after the retry.
>
> I have a couple of possible fixes in the Job Manager area and am currently
> evaluating the approach.
>
> Kind Regards,
> Chandan Khandelwal
>
>
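For anyone following along, the crash window described in the quoted message can be illustrated with a small toy model. This is only a sketch and not the OFBiz code: the class CrashWindowSketch, the Job holder, the job name "nightlySync", the "DAILY" expression id and the helper methods are all hypothetical. The status ids SERVICE_PENDING, SERVICE_FINISHED and SERVICE_CRASHED are assumptions on my part; only SERVICE_QUEUED, SERVICE_RUNNING, tempExprId and reloadCrashedJobs() come from the report itself.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the crash window. NOT the OFBiz implementation; every name
// here is hypothetical except the identifiers quoted from the report.
class CrashWindowSketch {

    // Hypothetical stand-in for a job record, reduced to the fields we need.
    static final class Job {
        final String name;
        String statusId;
        final String tempExprId; // null means no recurrence is attached

        Job(String name, String statusId, String tempExprId) {
            this.name = name;
            this.statusId = statusId;
            this.tempExprId = tempExprId;
        }
    }

    // Runs one recurring job. If crashBeforeNextRecurrence is true, the
    // "server" dies after the job is marked running but before the next
    // occurrence has been created, and recovery runs on restart.
    static List<Job> runOnce(boolean crashBeforeNextRecurrence) {
        List<Job> store = new ArrayList<>();
        Job current = new Job("nightlySync", "SERVICE_QUEUED", "DAILY");
        store.add(current);

        current.statusId = "SERVICE_RUNNING";
        if (crashBeforeNextRecurrence) {
            reloadCrashedJobs(store); // simulated restart
            return store;
        }

        // Normal path: the next occurrence inherits tempExprId.
        store.add(new Job(current.name, "SERVICE_PENDING", current.tempExprId));
        current.statusId = "SERVICE_FINISHED";
        return store;
    }

    // Models the recovery behaviour described in the report: it assumes the
    // next occurrence already exists, so the retry carries no tempExprId.
    static void reloadCrashedJobs(List<Job> store) {
        List<Job> retries = new ArrayList<>();
        for (Job job : store) {
            if ("SERVICE_RUNNING".equals(job.statusId)) {
                job.statusId = "SERVICE_CRASHED";
                retries.add(new Job(job.name, "SERVICE_PENDING", null));
            }
        }
        store.addAll(retries);
    }

    // Counts pending jobs that still carry a recurrence.
    static long pendingRecurringJobs(List<Job> store) {
        return store.stream()
                .filter(j -> "SERVICE_PENDING".equals(j.statusId) && j.tempExprId != null)
                .count();
    }

    public static void main(String[] args) {
        System.out.println("normal run: " + pendingRecurringJobs(runOnce(false)));
        System.out.println("crash run:  " + pendingRecurringJobs(runOnce(true)));
    }
}
```

In this model the normal run leaves one pending job that still has a recurrence, while the crash run leaves only a one-off retry, so nothing is scheduled after the retry completes. Whether the real fix belongs in the recovery step or in the ordering of the status change and recurrence creation is still open on the ticket.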
