Hi Scott,

We currently seem to have some issues with our job scheduler [1],
which look like some sort of memory leak. We are also experiencing
some performance issues and other anomalies, so it seems like a good
time to revisit the whole thing.

Are you suggesting replacing LinkedBlockingQueue with
PriorityBlockingQueue? If so, I think it might actually be a better
option. I think being unbounded _might_ actually resolve some of the
pain points we're facing. I didn't get why it's not a drop-in
replacement though. It matches the signature of the call in the
executor service, unless I'm missing something somewhere?
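
To make the question concrete, here is a rough sketch of the kind of swap
I have in mind. It is not OFBiz code: the PrioritizedJob class, the job
names, the priority numbers and the pool size of one thread are all made
up for illustration. It only relies on the ThreadPoolExecutor constructor
taking any BlockingQueue<Runnable>.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PriorityQueueSketch {

    // Hypothetical job wrapper (not an OFBiz class): a lower priority value runs first.
    static final class PrioritizedJob implements Runnable, Comparable<PrioritizedJob> {
        final String name;
        final int priority;
        final List<String> log;

        PrioritizedJob(String name, int priority, List<String> log) {
            this.name = name;
            this.priority = priority;
            this.log = log;
        }

        @Override
        public void run() {
            log.add(name); // record the execution order
        }

        @Override
        public int compareTo(PrioritizedJob other) {
            return Integer.compare(priority, other.priority);
        }
    }

    // Queues three jobs behind a blocked worker and returns the order they ran in.
    public static List<String> runOrder() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        // The constructor accepts any BlockingQueue<Runnable>, so the swap compiles.
        BlockingQueue<Runnable> queue = new PriorityBlockingQueue<>();
        // A single worker thread keeps the execution order deterministic.
        ThreadPoolExecutor executor =
                new ThreadPoolExecutor(1, 1, 60L, TimeUnit.SECONDS, queue);

        CountDownLatch gate = new CountDownLatch(1);
        // The first task goes straight to the new worker, not through the queue,
        // and holds that worker until the gate opens.
        executor.execute(() -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // These pile up in the priority queue while the worker is busy.
        executor.execute(new PrioritizedJob("purge", 10, log));
        executor.execute(new PrioritizedJob("import", 1, log));
        executor.execute(new PrioritizedJob("email", 5, log));

        gate.countDown();
        executor.shutdown(); // already queued jobs still run after shutdown()
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(runOrder());
    }
}
```

The "purge" job is submitted first but runs last, which is the behaviour
you described for PurgeJobs. One thing I noticed while writing this: the
jobs have to go in through execute() and not submit(), since submit()
wraps them in a FutureTask, which is not Comparable.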

[1] https://issues.apache.org/jira/browse/OFBIZ-10592

On Wed, Jan 30, 2019 at 10:59 PM Scott Gray
<scott.g...@hotwaxsystems.com> wrote:
>
> Hi folks,
>
> Just jotting down some issues with the JobManager I've noticed over the
> last few days:
> 1. min-threads in serviceengine.xml is never exceeded unless the job count
> in the queue exceeds 5000 (or whatever is configured).  Is this not obvious
> to anyone else?  I don't think this was the behavior prior to a refactoring
> a few years ago.
> 2. The advice on the number of threads to use doesn't seem good to me; it
> assumes your jobs are CPU bound when in my experience they are more likely
> to be I/O bound while making db or external API calls, sending emails etc.
> With the default setup, it only takes two long running jobs to effectively
> block the processing of any others until the queue hits 5000 and the other
> threads are finally opened up.  If you're not quickly maxing out the queue
> then any other jobs are stuck until the slow jobs finally complete.
> 3. Purging old jobs doesn't seem to be well implemented to me. From what
> I've seen the system is only capable of clearing a few hundred per minute
> and if you've filled the queue with them then regular jobs have to queue
> behind them and can take many minutes to finally be executed.
>
> I'm wondering if anyone has experimented with reducing the queue size?
> I'm considering reducing it to say 100 jobs per thread (along with
> increasing the thread count).  In theory it would reduce the time real jobs
> have to sit behind PurgeJobs and would also open up additional threads for
> use earlier.
>
> Alternatively I've pondered trying a PriorityBlockingQueue for the job
> queue (unfortunately the implementation is unbounded though so it isn't a
> drop-in replacement) so that PurgeJobs always sit at the back of the
> queue.  It might also allow prioritizing certain "user facing" jobs (such
> as asynchronous data imports) over lower-priority, less time-critical jobs.
> Maybe another option (or in conjunction) is some sort of "swim-lane"
> queue/executor that allocates jobs to threads based on prior execution
> speed so that slow running jobs can never use up all threads and block
> faster jobs.
>
> Any thoughts/experiences you have to share would be appreciated.
>
> Thanks
> Scott
