Hi all,

On our installation (v15.07) we suddenly see that one of our two job handlers
gets stuck with a high CPU load (the last log message is generally `cleaning up
external metadata files`) and no new messages appear. In addition, when
running workflows in batch (>6 at once), only a few of them (~3) get their
workflow steps/jobs scheduled (LSF-DRMAA). For the remaining ones, the new
histories are created but remain empty (according to the GUI). Only after
restarting the two job handlers are the remaining workflow steps scheduled
and shown in the history.

First question: how do we resolve this issue?
Second: how does this actually work? How are the workflow steps stored in
the database, i.e. why are they not shown in the web interface until they
are processed by a handler?

Possibly relevant config settings:
[server:handler0]
use_threadpool = true
threadpool_workers = 5

[server:handler1]
use_threadpool = true
threadpool_workers = 5

[app:main]
force_beta_workflow_scheduled_min_steps=1
force_beta_workflow_scheduled_for_collections=True
track_jobs_in_database = True
enable_job_recovery = True
retry_metadata_internally = False
# only a limit set for the very few local tools like upload
cache_user_job_count = True
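
For anyone who wants to compare the handler settings programmatically, here is a minimal sketch. It assumes Python 3's standard `configparser` and uses a copy of some of the settings above pasted into a string; the names `CONFIG_TEXT` and `handler_workers` are made up for illustration and are not part of Galaxy.

```python
import configparser

# Settings copied from this message. On a real installation they would be
# read from the Galaxy ini file instead (file name and path are assumptions).
CONFIG_TEXT = """
[server:handler0]
use_threadpool = true
threadpool_workers = 5

[server:handler1]
use_threadpool = true
threadpool_workers = 5

[app:main]
track_jobs_in_database = True
enable_job_recovery = True
"""


def handler_workers(text):
    """Return {handler_name: threadpool_workers} for each [server:handlerN] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    workers = {}
    for section in parser.sections():
        # Only the handler server sections are of interest here.
        if section.startswith("server:handler"):
            name = section.split(":", 1)[1]
            workers[name] = parser.getint(section, "threadpool_workers")
    return workers


if __name__ == "__main__":
    for name, count in sorted(handler_workers(CONFIG_TEXT).items()):
        print(name, count)
```

This only confirms that both handlers are configured identically; it does not explain why one of them hangs.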

Cheers,

Jelle