[ 
https://issues.apache.org/jira/browse/IGNITE-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853652#comment-15853652
 ] 

Dmitry Karachentsev commented on IGNITE-4395:
---------------------------------------------

[review|http://reviews.ignite.apache.org/ignite/review/IGNT-CR-89] 
[PR#1495|https://github.com/apache/ignite/pull/1495]


> Implement communication backpressure per policy - SYSTEM or PUBLIC
> ------------------------------------------------------------------
>
>                 Key: IGNITE-4395
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4395
>             Project: Ignite
>          Issue Type: Improvement
>          Components: cache, compute
>    Affects Versions: 1.7
>            Reporter: Dmitry Karachentsev
>            Assignee: Dmitry Karachentsev
>             Fix For: 1.9
>
>
> 1) Start two data nodes with a cache.
> 2) From one node, asynchronously submit a large number of jobs to the other.
> These jobs perform some cache operations.
> 3) The grid hangs almost immediately: all threads are sleeping except the
> public ones, which are waiting for responses.
> This happens because all cache and job messages are queued at the
> communication layer and capped by the default limit (1024). It looks like
> jobs are waiting for cache responses that cannot be received because of this
> limit.
> The proper solution here is to apply communication backpressure per policy
> (SYSTEM or PUBLIC) rather than at a single point, as is done now. This could
> be achieved with two queues per communication session or (which looks a bit
> easier to implement) with separate connections.
> [PR#1331|https://github.com/apache/ignite/pull/1331] contains a test that
> reproduces the grid hang.
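
The per-policy idea in the description can be illustrated with a minimal sketch. This is not Ignite's implementation or API: the class PerPolicyBackpressure, its methods, and the Policy enum are hypothetical names chosen for illustration, and a semaphore per policy stands in for a separate bounded queue or connection. The point is only that filling the PUBLIC limit no longer blocks SYSTEM traffic, so cache responses could still get through.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch: one in-flight message limit per policy instead of a single shared one. */
class PerPolicyBackpressure {
    /** Policies named in the issue; this enum is illustrative, not Ignite's API. */
    enum Policy { SYSTEM, PUBLIC }

    /** One independent permit pool per policy. */
    private final Map<Policy, Semaphore> permits = new EnumMap<>(Policy.class);

    PerPolicyBackpressure(int limitPerPolicy) {
        for (Policy p : Policy.values())
            permits.put(p, new Semaphore(limitPerPolicy));
    }

    /** Tries to reserve a slot for an outgoing message; false means the sender must back off. */
    boolean tryEnqueue(Policy policy) {
        return permits.get(policy).tryAcquire();
    }

    /** Frees a slot once the message has been acknowledged by the receiver. */
    void acknowledge(Policy policy) {
        permits.get(policy).release();
    }

    /** Number of free slots left for the given policy. */
    int available(Policy policy) {
        return permits.get(policy).availablePermits();
    }
}
```

With a single shared limit, job messages alone could consume every slot. Here each policy exhausts only its own pool, which is the property the two-queue and separate-connection proposals both aim for.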



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
