[
https://issues.apache.org/jira/browse/IGNITE-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Karachentsev updated IGNITE-4395:
----------------------------------------
Fix Version/s: 2.1 (was: 2.0)
> Implement communication backpressure per policy - SYSTEM or PUBLIC
> ------------------------------------------------------------------
>
> Key: IGNITE-4395
> URL: https://issues.apache.org/jira/browse/IGNITE-4395
> Project: Ignite
> Issue Type: Improvement
> Components: cache, compute
> Affects Versions: 1.7
> Reporter: Dmitry Karachentsev
> Assignee: Dmitry Karachentsev
> Fix For: 2.1
>
>
> 1) Start two data nodes with a cache.
> 2) From one node, asynchronously submit a large number of jobs to the other.
> Each job performs some cache operations.
> 3) The grid hangs almost immediately: all threads are sleeping except the
> public ones, which are waiting for responses.
> This happens because all cache and job messages are queued in the
> communication layer, where they share a single limit set to the default
> (1024). It looks like the jobs are waiting for cache responses that cannot
> be received because of this limit.
> The proper solution is to apply communication backpressure per policy
> (SYSTEM or PUBLIC) rather than at a single point, as it is now. This could
> be achieved with two queues per communication session or, which looks a bit
> easier to implement, with separate connections.
> [PR#1331|https://github.com/apache/ignite/pull/1331] contains a test that
> reproduces the grid hang.
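
To illustrate the difference between a single limit and a per-policy limit, here is a minimal standalone sketch. It is not Ignite code and not the change from the PR: the `BackpressureSketch`, `Policy` and `Limiter` names are hypothetical, and the sketch only models message counting with plain JDK semaphores. It floods the limiter with PUBLIC (job) messages and then checks whether a SYSTEM (cache) message can still be enqueued.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

public class BackpressureSketch {
    /** Hypothetical policies, named after SYSTEM / PUBLIC in the issue. */
    enum Policy { SYSTEM, PUBLIC }

    /** Limits in-flight messages with one shared budget or one budget per policy. */
    static final class Limiter {
        private final Map<Policy, Semaphore> permits = new EnumMap<>(Policy.class);

        Limiter(int limit, boolean perPolicy) {
            Semaphore shared = new Semaphore(limit);

            // Per-policy mode gives each policy its own budget;
            // otherwise every policy draws from the same one.
            for (Policy p : Policy.values())
                permits.put(p, perPolicy ? new Semaphore(limit) : shared);
        }

        /** Non-blocking: returns false where a real sender would have to wait. */
        boolean tryEnqueue(Policy p) {
            return permits.get(p).tryAcquire();
        }

        /** To be called once a message has been processed by the receiver. */
        void onProcessed(Policy p) {
            permits.get(p).release();
        }
    }

    /** Floods the limiter with job messages, then tries to enqueue one cache message. */
    static boolean cacheMessageFitsAfterJobFlood(Limiter limiter, int jobs) {
        for (int i = 0; i < jobs; i++)
            limiter.tryEnqueue(Policy.PUBLIC);

        return limiter.tryEnqueue(Policy.SYSTEM);
    }

    public static void main(String[] args) {
        int limit = 1024; // The default limit mentioned in the issue.

        System.out.println("shared limit: "
            + cacheMessageFitsAfterJobFlood(new Limiter(limit, false), 5000));
        System.out.println("per policy: "
            + cacheMessageFitsAfterJobFlood(new Limiter(limit, true), 5000));
    }
}
```

With a shared budget the job messages use up every permit, so the cache message is rejected and the sketch prints `shared limit: false`. With one budget per policy the cache message still fits and it prints `per policy: true`. In a real grid the rejected cache message is the response the jobs are waiting for, which matches the hang described above.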
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)