Hi Everyone,

I came across an issue in the OFBiz job polling and scheduling process and would like to get input from the community.

There appears to be a race condition where a *recurring job can lose its recurrence (tempExprId)* if the server crashes at a specific point during execution.

When a job moves from SERVICE_QUEUED to SERVICE_RUNNING, a crash that occurs *before the next recurrence is created* leaves the job in the SERVICE_RUNNING state. On restart, JobManager.reloadCrashedJobs() assumes the next recurrence already exists and reschedules the job without tempExprId. Since the recurrence was never actually created, the chain breaks and the job does not run again after the retry.

I have a couple of possible fixes in the Job Manager area and am currently evaluating which approach to take.

Kind Regards,
Chandan Khandelwal
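
For reference, the failure sequence described in this message can be modelled with a small standalone sketch. This is not OFBiz code: the Job class, the in-memory list standing in for the job table, the startJob helper, the SERVICE_PENDING state used for "scheduled" rows, and the "EVERY_HOUR" expression id are all assumptions made for illustration. Only the status names SERVICE_QUEUED and SERVICE_RUNNING, the tempExprId field, and the described behaviour of reloadCrashedJobs() come from the report above.

```java
import java.util.ArrayList;
import java.util.List;

public class RecurrenceCrashSketch {

    // SERVICE_PENDING is an assumed "scheduled, not yet picked up" state for this model.
    enum Status { SERVICE_PENDING, SERVICE_QUEUED, SERVICE_RUNNING }

    // Simplified stand-in for a job row (hypothetical, not the real entity).
    static final class Job {
        final String tempExprId; // null means the job carries no recurrence
        Status status;

        Job(String tempExprId, Status status) {
            this.tempExprId = tempExprId;
            this.status = status;
        }
    }

    // Hypothetical run step: mark the job RUNNING, then create the next recurrence.
    // If the simulated crash happens first, the next recurrence is never stored.
    static void startJob(List<Job> store, Job job, boolean crashBeforeNextRecurrence) {
        job.status = Status.SERVICE_RUNNING;
        if (crashBeforeNextRecurrence) {
            return; // simulated crash: nothing after this point executes
        }
        store.add(new Job(job.tempExprId, Status.SERVICE_PENDING));
        // In this model the server still crashes later, while the job is RUNNING.
    }

    // Models the recovery behaviour described in the report: every job found in
    // SERVICE_RUNNING is rescheduled as a retry WITHOUT its tempExprId.
    static void reloadCrashedJobs(List<Job> store) {
        List<Job> retries = new ArrayList<>();
        for (Job job : store) {
            if (job.status == Status.SERVICE_RUNNING) {
                retries.add(new Job(null, Status.SERVICE_PENDING));
            }
        }
        store.addAll(retries);
    }

    // True if at least one scheduled job still carries a recurrence.
    static boolean hasPendingRecurrence(List<Job> store) {
        for (Job job : store) {
            if (job.status == Status.SERVICE_PENDING && job.tempExprId != null) {
                return true;
            }
        }
        return false;
    }

    // Runs the whole scenario and reports whether the recurrence chain survives.
    public static boolean recurrenceSurvivesCrash(boolean crashBeforeNextRecurrence) {
        List<Job> store = new ArrayList<>();
        Job job = new Job("EVERY_HOUR", Status.SERVICE_QUEUED); // hypothetical expression id
        store.add(job);
        startJob(store, job, crashBeforeNextRecurrence);
        reloadCrashedJobs(store);
        return hasPendingRecurrence(store);
    }

    public static void main(String[] args) {
        System.out.println("crash before next recurrence: "
                + recurrenceSurvivesCrash(true));
        System.out.println("crash after next recurrence:  "
                + recurrenceSurvivesCrash(false));
    }
}
```

In this model, a crash after the next recurrence has been stored leaves a scheduled job that still carries tempExprId, so the chain continues. A crash before that point leaves only the retry, which has no tempExprId, so nothing is scheduled once the retry completes.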
