[
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805720#comment-17805720
]
Vipul Thakur edited comment on IGNITE-21059 at 1/25/24 7:08 AM:
----------------------------------------------------------------
I also ran
*{{control.sh|bat --cache contention 5}}*
*OUTPUT*
JVM_OPTS environment variable is set, but will not be used. To pass JVM options
use CONTROL_JVM_OPTS
JVM_OPTS=-Xms1g -Xmx1g -XX:+AlwaysPreTouch -Djava.net.preferIPv4Stack=true
Jan 11, 2024 10:40:23 PM
org.apache.ignite.internal.client.impl.connection.GridClientNioTcpConnection
<init>
INFO: Client TCP connection established: localhost/127.0.0.1:11211
2024-01-11T22:40:23,579][INFO
][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi]
Established outgoing communication connection [locAddr=x.x.x.x:41264,
rmtAddr=/x.x.x.x:47100]
2024-01-11T22:40:23,594][INFO
][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi]
Established outgoing communication connection [locAddr=/x.x.x.x:56674,
rmtAddr=/x.x.x.x:47100]
Jan 11, 2024 10:40:23 PM
org.apache.ignite.internal.client.impl.connection.GridClientNioTcpConnection
close
INFO: Client TCP connection closed: localhost/127.0.0.1:11211
Jan 11, 2024 10:40:23 PM org.apache.ignite.internal.client.util.GridClientUtils
shutdownNow
WARNING: Runnable tasks outlived thread pool executor service
[owner=GridClientConnectionManager,
tasks=[java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@53f65459]]
[node=TcpDiscoveryNode [id=acfd7965-2d2a-498f-aa89-a57da5208cb4,
consistentId=c67390a7-9746-445b-9f40-b98ea32cc1ed, addrs=ArrayList [x.x.x.x
127.0.0.1], sockAddrs=null, discPort=47500, order=90, intOrder=48,
lastExchangeTime=1704993022880, loc=false, ver=2.14.0#20220929-sha1:951e8deb,
isClient=false]]
[node=TcpDiscoveryNode [id=3f5fc804-95f7-4151-809c-ad52c0528806,
consistentId=3204dd77-8571-4c06-a059-aaf2ec06b739, addrs=ArrayList [x.x.x.x
127.0.0.1], sockAddrs=null, discPort=47500, order=88, intOrder=47,
lastExchangeTime=1704993022880, loc=false, ver=2.14.0#20220929-sha1:951e8deb,
isClient=false]]
[node=TcpDiscoveryNode [id=855b22e7-0ad7-4521-ab53-3af65b6fce73,
consistentId=ee70a820-92a5-48c7-a5da-4965c946b550, addrs=ArrayList [x.x.x.x,
127.0.0.1], sockAddrs=null, discPort=47500, order=4, intOrder=4,
lastExchangeTime=1704993022880, loc=false, ver=2.14.0#20220929-sha1:951e8deb,
isClient=false]]
Control utility [ver. 2.14.0#20220929-sha1:951e8deb]
2022 Copyright(C) Apache Software Foundation
Time: 2024-01-11T22:40:22.947
Command [CACHE] started
Arguments: --host localhost --port 11211 --user xxxx --password ***** --cache
contention 5
--------------------------------------------------------------------------------
Command [CACHE] finished with code: 0
Control utility has completed execution at: 2024-01-11T22:40:23.734
Execution time: 787 ms
> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running
> cache operations
> --------------------------------------------------------------------------------------------
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
> Issue Type: Bug
> Components: binary, clients
> Affects Versions: 2.14
> Reporter: Vipul Thakur
> Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml,
> client-service.zip, digiapi-eventprocessing-app-zone1-6685b8d7f7-ntw27.log,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3,
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1,
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2,
> ignite-server-nohup-1.out, ignite-server-nohup.out, ignite_issue_1101.zip,
> image-2024-01-11-22-28-51-501.png, image.png, long_txn_.png, nohup_12.out
>
>
> We recently upgraded from 2.7.6 to 2.14 because of an issue observed in our
> production environment, where the cluster would hang during partition map
> exchange.
> Please see the ticket I created a while back for Ignite 2.7.6:
> https://issues.apache.org/jira/browse/IGNITE-13298
> We migrated to Apache Ignite 2.14 and the upgrade went smoothly, but on the
> third day we saw cluster traffic dip again.
> We have 5 nodes in the cluster, for which we provide 400 GB of RAM and more
> than 1 TB of SSD.
> The config is attached for review.
> I have also added the server logs from the time the issue happened.
> We have set a transaction timeout as well as a socket timeout at both the
> server and client ends for our write operations, but the cluster sometimes
> still goes into a hang state: all our get calls get stuck, and gradually
> everything starts to freeze, including our JMS listener threads, until every
> thread is choked.
> As a result, our read services, which do not even use transactions to
> retrieve data, also start to choke, ultimately leading to a dip in end-user
> traffic.
> We were hoping the product upgrade would help, but that has not been the case
> so far.
>
>
>
>
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)