Hi Scott,

We currently have some issues with our job scheduler [1], apparently some sort of memory leak. We are also seeing performance problems and other anomalies, so this seems like a good time to revisit the whole thing.

Are you suggesting replacing LinkedBlockingQueue with PriorityBlockingQueue? If so, I think it might actually be a better option. I think being unbounded _might_ actually resolve some of the pain points we're facing. I didn't get why it's not a drop-in replacement though. It matches the signature of the call in the executor service, unless I'm missing something somewhere?

[1] https://issues.apache.org/jira/browse/OFBIZ-10592

On Wed, Jan 30, 2019 at 10:59 PM Scott Gray <scott.g...@hotwaxsystems.com> wrote:
>
> Hi folks,
>
> Just jotting down some issues with the JobManager I've noticed over the
> last few days:
> 1. min-threads in serviceengine.xml is never exceeded unless the job count
> in the queue exceeds 5000 (or whatever is configured). Is this not obvious
> to anyone else? I don't think this was the behavior prior to a refactoring
> a few years ago.
> 2. The advice on the number of threads to use doesn't seem good to me: it
> assumes your jobs are CPU bound when in my experience they are more likely
> to be I/O bound while making db or external API calls, sending emails etc.
> With the default setup, it only takes two long running jobs to effectively
> block the processing of any others until the queue hits 5000 and the other
> threads are finally opened up. If you're not quickly maxing out the queue
> then any other jobs are stuck until the slow jobs finally complete.
> 3. Purging old jobs doesn't seem to be well implemented to me. From what
> I've seen, the system is only capable of clearing a few hundred per minute,
> and if you've filled the queue with them then regular jobs have to queue
> behind them and can take many minutes to finally be executed.
>
> I'm wondering if anyone has experimented with reducing the queue size?
> I'm considering reducing it to, say, 100 jobs per thread (along with
> increasing the thread count). In theory it would reduce the time real jobs
> have to sit behind PurgeJobs and would also open up additional threads for
> use earlier.
>
> Alternatively I've pondered trying a PriorityBlockingQueue for the job
> queue (unfortunately the implementation is unbounded though, so it isn't a
> drop-in replacement) so that PurgeJobs always sit at the back of the
> queue. It might also allow prioritizing certain "user facing" jobs (such
> as asynchronous data imports) over lower priority, less time critical jobs.
> Maybe another option (or in conjunction) is some sort of "swim-lane"
> queue/executor that allocates jobs to threads based on prior execution
> speed so that slow running jobs can never use up all threads and block
> faster jobs.
>
> Any thoughts/experiences you have to share would be appreciated.
>
> Thanks
> Scott
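
For reference, a minimal Java sketch of the two behaviours discussed in this thread, using only java.util.concurrent. It is not the OFBiz JobManager code: the class names JobQueueSketch and PrioritizedJob, the pool sizes (2 core, 5 max) and the queue capacity of 3 are all hypothetical, chosen to keep the output small. It is meant to show (a) that a ThreadPoolExecutor only starts threads beyond its core size once a bounded queue is full, so an unbounded PriorityBlockingQueue never triggers the extra threads, and (b) that a PriorityBlockingQueue runs queued jobs in priority order, provided the queued Runnables are Comparable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class JobQueueSketch {

    /** Hypothetical job wrapper: a lower priority value is dequeued first. */
    static final class PrioritizedJob implements Runnable, Comparable<PrioritizedJob> {
        final int priority;
        final Runnable body;

        PrioritizedJob(int priority, Runnable body) {
            this.priority = priority;
            this.body = body;
        }

        @Override public void run() { body.run(); }

        @Override public int compareTo(PrioritizedJob other) {
            return Integer.compare(priority, other.priority);
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Queues `jobs` blocking jobs and reports how many worker threads the pool started. */
    public static int threadsStarted(BlockingQueue<Runnable> queue, int core, int max, int jobs)
            throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 1, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < jobs; i++) {
                // execute(), not submit(): submit() would wrap the job in a FutureTask
                pool.execute(new PrioritizedJob(i, () -> awaitQuietly(release)));
            }
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let the blocked jobs finish
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    /** One worker, two queued jobs: returns the order in which the queued jobs ran. */
    public static List<String> executionOrder() throws InterruptedException {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 1, TimeUnit.SECONDS, new PriorityBlockingQueue<Runnable>());
        pool.execute(new PrioritizedJob(0, () -> awaitQuietly(release))); // occupies the only thread
        pool.execute(new PrioritizedJob(9, () -> order.add("purge")));    // queued first, low priority
        pool.execute(new PrioritizedJob(1, () -> order.add("import")));   // queued second, high priority
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        // Bounded queue of 3, core 2, max 5: extra threads appear only once the queue is full.
        System.out.println(threadsStarted(new LinkedBlockingQueue<>(3), 2, 5, 4)); // 2
        System.out.println(threadsStarted(new LinkedBlockingQueue<>(3), 2, 5, 7)); // 4
        // Unbounded priority queue: offer() never fails, so the pool stays at core size.
        System.out.println(threadsStarted(new PriorityBlockingQueue<>(), 2, 5, 7)); // 2
        System.out.println(executionOrder()); // [import, purge]
    }
}
```

Two caveats on this sketch. With an unbounded queue the maximum pool size is effectively ignored, so the core size would have to be raised to the thread count actually wanted. The queued objects must also be mutually Comparable (or a Comparator must be supplied); a job passed through submit() is wrapped in a FutureTask, which is not Comparable, and would be expected to fail with a ClassCastException once it is queued.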