The thing to look for in GC logs is signs that you’re bouncing against your memory limits and spending a lot of time in full GCs.
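As a rough sketch of that check, the snippet below totals stop-the-world time inside a time window. It assumes Java 8 style GC logs written with `-XX:+PrintGCDateStamps` and `-XX:+PrintGCApplicationStoppedTime`; if your log format differs, the regex needs adjusting. The sample lines are made up purely to illustrate the assumed shape, and `stopped_seconds` is a hypothetical helper name. The window matches the gap between the gossip and Netty log lines reported in this thread.

```python
import re
from datetime import datetime

# Assumed line shape (Java 8 with PrintGCDateStamps + PrintGCApplicationStoppedTime):
# <ISO timestamp><offset>: <uptime>: Total time for which application threads
# were stopped: <seconds> seconds, ...
STOPPED = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+[+-]\d{4}: .*"
    r"Total time for which application threads were stopped: ([\d.]+) seconds"
)

def stopped_seconds(lines, start, end):
    """Sum application-stopped time for log lines stamped within [start, end].

    The UTC offset in the log is ignored, so start/end must be expressed in
    the same local time the GC log uses.
    """
    total = 0.0
    for line in lines:
        m = STOPPED.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
        if start <= ts <= end:
            total += float(m.group(2))
    return total

# Made-up sample lines, only to show the assumed format.
sample = [
    "2020-05-31T23:51:30.000+0000: 40.000: Total time for which application "
    "threads were stopped: 0.5000000 seconds, Stopping threads took: 0.0001000 seconds",
    "2020-05-31T23:52:30.000+0000: 100.000: Total time for which application "
    "threads were stopped: 1.2500000 seconds, Stopping threads took: 0.0001000 seconds",
    "2020-05-31T23:55:00.000+0000: 250.000: Total time for which application "
    "threads were stopped: 2.0000000 seconds, Stopping threads took: 0.0001000 seconds",
]

start = datetime(2020, 5, 31, 23, 51, 15)
end = datetime(2020, 5, 31, 23, 54, 6)
paused = stopped_seconds(sample, start, end)
window = (end - start).total_seconds()
print(f"{paused:.2f}s paused out of {window:.0f}s ({paused / window:.1%})")
```

If the paused share of the window is large, that points toward heap pressure rather than a timeout setting; if it is tiny, GC is probably not where the time is going.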
I’m not sure at what phase it kicks in, but there is definitely the potential for memory issues when you have large column families (large in the number of columns, I mean), and your mention that the situation gets worse in proportion to the number of tables brought GC to mind. I’m not sure about the proportion to the number of nodes; I think there are thread counts that increase with the number of nodes, and more threads can also add to GC load, particularly with G1GC.

I’m speculating a bit on possible causes, but basically the idea was to look for GC load during those 3 minutes, because if you see it then you’re not hunting for a timeout tuning or anything like that, you’re hunting for a resource allocation tuning.

From: Jai Bheemsen Rao Dhanwada <jaibheem...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, June 1, 2020 at 7:15 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Cassandra Bootstrap Sequence

Is there anything specific to look for in GC logs? BTW, this delay always happens whenever I bootstrap a node or restart a C* process.

I don't believe it's a GC issue. Also, a correction to my initial question: it's not just bootstrap; every restart of the C* process causes this.

On Mon, Jun 1, 2020 at 3:22 PM Reid Pinchback <rpinchb...@tripadvisor.com> wrote:

That gap seems a long time. Have you checked GC logs around the timeframe?
From: Jai Bheemsen Rao Dhanwada <jaibheem...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, June 1, 2020 at 3:52 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Cassandra Bootstrap Sequence

Hello Team,

When I am bootstrapping/restarting a Cassandra node, there is a delay between gossip settling and the port opening. Can someone please explain to me where this delay is configured and whether it can be changed? I don't see any information in the logs.

In my case, as you can see below, there is a ~3 minute delay, and it increases as I increase the number of tables, nodes, and DCs.

INFO [main] 2020-05-31 23:51:07,554 Gossiper.java:1692 - Waiting for gossip to settle...
INFO [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
INFO [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop
INFO [main] 2020-05-31 23:54:06,913 Server.java:155 - Using Netty Version: [netty-buffer=netty-buffer-4.0.44.Final.452812a, netty-codec=netty-codec-4.0.44.Final.452812a, netty-codec-haproxy=netty-codec-haproxy-4.0.44.Final.452812a, netty-codec-http=netty-codec-http-4.0.44.Final.452812a, netty-codec-socks=netty-codec-socks-4.0.44.Final.452812a, netty-common=netty-common-4.0.44.Final.452812a, netty-handler=netty-handler-4.0.44.Final.452812a, netty-tcnative=netty-tcnative-1.1.33.Fork26.142ecbb, netty-transport=netty-transport-4.0.44.Final.452812a, netty-transport-native-epoll=netty-transport-native-epoll-4.0.44.Final.452812a, netty-transport-rxtx=netty-transport-rxtx-4.0.44.Final.452812a, netty-transport-sctp=netty-transport-sctp-4.0.44.Final.452812a, netty-transport-udt=netty-transport-udt-4.0.44.Final.452812a]
INFO [main] 2020-05-31 23:54:06,913 Server.java:156 - Starting listening for CQL clients on /x.x.x.x:9042 (encrypted)...

Also, during this 3 minute delay, I am losing all my metrics from the C* nodes (basically the metrics are not returned within 10s).

Can someone please help me understand the delay here?

Cassandra Version: 3.11.3
Metrics: Using telegraf to collect metrics.
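
For anyone wanting to pin down where a startup stalls, here is a minimal sketch that scans system.log lines for one thread and reports the largest gap between consecutive entries. The three input lines are copied from the excerpt above; `largest_gap` is a hypothetical helper name, and the regex assumes the `[thread] yyyy-MM-dd HH:mm:ss,SSS` layout shown in that excerpt.

```python
import re
from datetime import datetime

# Matches "[thread] 2020-05-31 23:51:07,554 rest-of-line" as in the excerpt.
LINE = re.compile(
    r"\[([^\]]+)\]\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(.*)"
)

def largest_gap(lines, thread="main"):
    """Return (gap_seconds, message_before, message_after) for the biggest
    pause between consecutive log lines of the given thread, or None if
    fewer than two lines match."""
    events = []
    for line in lines:
        m = LINE.search(line)
        if m and m.group(1) == thread:
            ts = datetime.strptime(m.group(2), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, m.group(3)))
    best = None
    for (t0, msg0), (t1, msg1) in zip(events, events[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, msg0, msg1)
    return best

log = [
    "INFO [main] 2020-05-31 23:51:07,554 Gossiper.java:1692 - Waiting for gossip to settle...",
    "INFO [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding",
    "INFO [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop",
]

gap, before, after = largest_gap(log)
print(f"{gap:.3f}s between:\n  {before}\n  {after}")
```

The start and end of the reported gap give the window to line up against GC logs or thread dumps taken during the same restart.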