Hi Scott,

We currently have some issues with our job scheduler, which look like some sort of memory leak. We are also experiencing some performance issues and other anomalies. It seems like a good time to revisit the whole thing.

Are you suggesting replacing LinkedBlockingQueue with PriorityBlockingQueue? If so, I think it might actually be a better option. I think being unbounded _might_ actually resolve some of the pain points we're facing. I didn't get why it's not a drop-in replacement, though. It matches the signature of the call in the executor service, unless I'm missing something somewhere?

https://issues.apache.org/jira/browse/OFBIZ-10592

On Wed, Jan 30, 2019 at 10:59 PM Scott Gray <scott.g...@hotwaxsystems.com> wrote:
>
> Hi folks,
>
> Just jotting down some issues with the JobManager I've noticed over the
> last few days:
> 1. min-threads in serviceengine.xml is never exceeded unless the job count
> in the queue exceeds 5000 (or whatever is configured). Is this not obvious
> to anyone else? I don't think this was the behavior prior to a refactoring
> a few years ago.
> 2. The advice on the number of threads to use doesn't seem good to me, it
> assumes your jobs are CPU bound when in my experience they are more likely
> to be I/O bound while making db or external API calls, sending emails etc.
> With the default setup, it only takes two long running jobs to effectively
> block the processing of any others until the queue hits 5000 and the other
> threads are finally opened up. If you're not quickly maxing out the queue
> then any other jobs are stuck until the slow jobs finally complete.
> 3. Purging old jobs doesn't seem to be well implemented to me, from what
> I've seen the system is only capable of clearing a few hundred per minute
> and if you've filled the queue with them then regular jobs have to queue
> behind them and can take many minutes to finally be executed.
>
> I'm wondering if anyone has experimented with reducing the queue size?
> I'm considering reducing it to say 100 jobs per thread (along with
> increasing the thread count). In theory it would reduce the time real jobs
> have to sit behind PurgeJobs and would also open up additional threads for
> use earlier.
>
> Alternatively I've pondered trying a PriorityBlockingQueue for the job
> queue (unfortunately the implementation is unbounded though so it isn't a
> drop-in replacement) so that PurgeJobs always sit at the back of the
> queue. It might also allow prioritizing certain "user facing" jobs (such
> as asynchronous data imports) over lower priority less time critical jobs.
> Maybe another option (or in conjunction) is some sort of "swim-lane"
> queue/executor that allocates jobs to threads based on prior execution
> speed so that slow running jobs can never use up all threads and block
> faster jobs.
>
> Any thoughts/experiences you have to share would be appreciated.
>
> Thanks
> Scott
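
On the drop-in question and point 1 in the quoted message, here is a minimal sketch of how a plain java.util.concurrent.ThreadPoolExecutor grows its pool. It assumes the job poller uses that class directly. The class name PoolGrowthDemo, the Job type, and the pool sizes (2 core, 5 max, queue capacity 5) are hypothetical and not taken from the OFBiz code. The executor only starts threads beyond the core size when the queue rejects an offer, so a bounded queue has to fill up first, and an unbounded PriorityBlockingQueue never triggers growth at all.

```java
import java.util.Comparator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolGrowthDemo {

    // Hypothetical job: blocks until released, so every worker stays busy.
    static final class Job implements Runnable {
        final int priority; // lower value would run first in the priority queue
        final CountDownLatch release;

        Job(int priority, CountDownLatch release) {
            this.priority = priority;
            this.release = release;
        }

        @Override
        public void run() {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Submits `jobs` blocking jobs and returns the largest pool size reached.
    static int peakPoolSize(BlockingQueue<Runnable> queue, int core, int max, int jobs) {
        ThreadPoolExecutor pool =
                new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < jobs; i++) {
                pool.execute(new Job(i, release));
            }
            return pool.getLargestPoolSize();
        } finally {
            release.countDown(); // let the blocked jobs finish
            pool.shutdown();
        }
    }

    // Bounded FIFO queue, 2 core threads, 5 max threads.
    static int boundedPeak(int capacity, int jobs) {
        return peakPoolSize(new LinkedBlockingQueue<>(capacity), 2, 5, jobs);
    }

    // Unbounded priority queue, same thread settings.
    static int priorityPeak(int jobs) {
        // A comparator is required because the queue compares its elements.
        Comparator<Runnable> byPriority =
                Comparator.comparingInt(r -> ((Job) r).priority);
        return peakPoolSize(new PriorityBlockingQueue<>(11, byPriority), 2, 5, jobs);
    }

    public static void main(String[] args) {
        // 2 running + 5 queued: the queue is full but nothing was refused yet.
        System.out.println("bounded, 7 jobs:    " + boundedPeak(5, 7));   // 2
        // Jobs 8 to 10 are refused by the queue, so extra threads start.
        System.out.println("bounded, 10 jobs:   " + boundedPeak(5, 10));  // 5
        // The unbounded queue accepts everything, so the pool stays at core size.
        System.out.println("priority, 100 jobs: " + priorityPeak(100));   // 2
    }
}
```

So the constructor signature does match, but with an unbounded queue the max-threads setting would effectively be ignored unless the core size is raised to compensate. Note also that the queue needs comparable elements or a comparator, which is why the sketch uses execute() with a comparator instead of submit().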