[
https://issues.apache.org/jira/browse/IGNITE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anton Vinogradov updated IGNITE-6940:
-------------------------------------
Description:
Ignite Thread Pools Starvation
Description
This situation can occur if a user submits tasks that recursively submit more
tasks and synchronously wait for their results. Jobs arrive at worker nodes and
are queued forever: the public pool has no free threads because all of its
threads are waiting for job results.
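The starvation pattern described above can be reproduced outside Ignite. The following is a minimal sketch using only the JDK's java.util.concurrent, not Ignite's public pool; the class and method names are hypothetical. An outer task submits an inner task to the same pool and blocks on its result. With a single worker thread the inner task can never start, so the wait only ends because the sketch bounds it with a timeout.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration class; not part of Ignite.
class StarvationSketch {
    /** Returns true if the nested, synchronous wait could not complete. */
    static boolean starves(int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Future<Boolean> outer = pool.submit(() -> {
                // The outer task submits more work to the SAME pool...
                Future<Integer> inner = pool.submit(() -> 42);
                try {
                    // ...and waits synchronously while holding its worker thread.
                    inner.get(500, TimeUnit.MILLISECONDS);
                    return false; // the inner task found a free thread
                } catch (TimeoutException e) {
                    return true;  // no free thread: the inner task stayed queued
                }
            });
            return outer.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 thread starves: " + starves(1));
        System.out.println("2 threads starve: " + starves(2));
    }
}
```

Without the bounded wait inside the outer task, the single-thread case would block forever, which corresponds to jobs being queued forever.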
Detection and Solution
A task timeout can be set so that the task is canceled automatically.
Web Console should provide the ability to cancel any task or job from the UI.
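The timeout-then-cancel idea can be sketched with plain JDK futures. This is an analogy under stated assumptions rather than Ignite's task timeout mechanism: the class name, the helper runWithTimeout and the "CANCELLED" marker are all hypothetical. A task that exceeds its timeout is canceled with an interrupt, which frees the worker thread for the next task.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration class; not part of Ignite.
class TimeoutCancelSketch {
    /** Runs the task; cancels it and returns a marker if it exceeds the timeout. */
    static String runWithTimeout(ExecutorService pool, Callable<String> task, long timeoutMs)
            throws Exception {
        Future<String> future = pool.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the worker so that the thread returns to the pool.
            future.cancel(true);
            return "CANCELLED";
        }
    }

    static String[] demo() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // The slow task occupies the only thread until it is canceled.
            String slow = runWithTimeout(pool, () -> { Thread.sleep(5_000); return "slow"; }, 200);
            // The freed thread can now run the next task.
            String fast = runWithTimeout(pool, () -> "done", 2_000);
            return new String[] {slow, fast};
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        String[] results = demo();
        System.out.println(results[0] + ", " + results[1]); // prints "CANCELLED, done"
    }
}
```

Cancellation by interrupt only helps if the task is blocked in an interruptible call or checks its interrupt status, so a task that ignores interrupts would still hold its thread.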
Report
Timed-out tasks and jobs should be reported in Web Console and written to the
logs. We need to introduce a new config property to set the timeout for
reported jobs.
The log record and Web Console should include:
- Master node ID
- Start time
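One possible shape for such a report is sketched below. Every name here is hypothetical: the JobInfo holder, the reportTimeout parameter standing in for the proposed config property, and the log line format are assumptions for illustration, not an existing Ignite API. The sketch interprets the timeout as the running time after which a job is reported.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical illustration class; not part of Ignite.
class LongRunningJobReportSketch {
    /** The two fields the issue asks to include in each report. */
    static final class JobInfo {
        final UUID masterNodeId;
        final Instant startTime;

        JobInfo(UUID masterNodeId, Instant startTime) {
            this.masterNodeId = masterNodeId;
            this.startTime = startTime;
        }
    }

    /** Builds one log line per job that has been running longer than reportTimeout. */
    static List<String> report(List<JobInfo> jobs, Duration reportTimeout, Instant now) {
        List<String> lines = new ArrayList<>();
        for (JobInfo job : jobs) {
            Duration running = Duration.between(job.startTime, now);
            if (running.compareTo(reportTimeout) > 0) {
                lines.add("Job exceeded report timeout [masterNodeId=" + job.masterNodeId
                    + ", startTime=" + job.startTime + "]");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<JobInfo> jobs = new ArrayList<>();
        jobs.add(new JobInfo(UUID.randomUUID(), now.minusSeconds(120))); // long-running
        jobs.add(new JobInfo(UUID.randomUUID(), now.minusSeconds(1)));   // recent
        report(jobs, Duration.ofSeconds(30), now).forEach(System.out::println);
    }
}
```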
was: There is existing code in {{IgniteKernal.start()}} that logs warnings
when it detects starvation. It should be improved to support more thread pools
and to update some metrics.
> Thread Starvation monitoring
> ----------------------------
>
> Key: IGNITE-6940
> URL: https://issues.apache.org/jira/browse/IGNITE-6940
> Project: Ignite
> Issue Type: Improvement
> Reporter: Andrey Kuznetsov
> Assignee: Andrey Kuznetsov
> Labels: iep-7