[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795614#comment-17795614 ]
Vipul Thakur commented on IGNITE-21059:
---------------------------------------
Hi, please review and comment, and let me know if more information is needed.
> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running
> cache operations
> --------------------------------------------------------------------------------------------
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
> Issue Type: Bug
> Components: binary, clients
> Affects Versions: 2.14
> Reporter: Vipul Thakur
> Priority: Critical
> Attachments: cache-config-1.xml,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3,
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1,
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2
>
>
> We recently upgraded from 2.7.6 to 2.14 because of an issue observed in our
> production environment, where the cluster would go into a hang state due to
> partition map exchange.
> Please see the ticket below, which I created a while back for Ignite 2.7.6:
> https://issues.apache.org/jira/browse/IGNITE-13298
> We migrated to Apache Ignite 2.14 and the upgrade went smoothly, but on the
> third day we saw cluster traffic dip again.
> We have 4 nodes in a cluster, for which we provide 400 GB of RAM and more
> than 1 TB of HDD storage.
> Please find the config attached. [I have added it as an attachment for
> review.]
> We have set a transaction timeout as well as a socket timeout, on both the
> server and the client side, for our write operations. Even so, the cluster
> sometimes goes into a hang state: all our get calls get stuck, and gradually
> everything starts to freeze. Our JMS listener threads, and eventually every
> thread, end up choked.
> As a result, our read services, which do not even use transactions to
> retrieve data, also start to choke, ultimately leading to a dip in end-user
> traffic.
> We were hoping the product upgrade would help, but that has not been the
> case so far.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)