[ 
https://issues.apache.org/jira/browse/CASSANDRA-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148717#comment-17148717
 ] 

Sylvain Lebresne commented on CASSANDRA-15850:
----------------------------------------------

From a look at the code, between gossip settling and starting the CQL server, 
the only thing that happens is that all the tables are "reloaded" (which 
involves a number of steps) to account for changes that could have happened 
once Gossip settles, and compactions are started.

None of that should take very long for a given table, but it's not the most 
optimized thing ever either, and we do reload all tables sequentially, so this 
may well be the culprit for the delay you are seeing.

Assuming I'm correct (I'm only going from a quick read of the code here), I 
don't think any configuration option will help reduce that delay (but it does 
make sense the # of tables is a main factor).

It's not a bug: the server is doing work, albeit maybe inefficiently.

I'm sure this could be improved though. At a minimum, it would be more user 
friendly to add a log message to explain what is being done so users are not 
left wondering what is going on.
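
To illustrate the kind of log message meant here, below is a minimal sketch, 
not actual Cassandra code. The ReloadableTable interface, the 
reloadAllWithLogging method and the message wording are all hypothetical 
stand-ins; the sketch only shows logging before and after a sequential reload 
so the pause is visible in the logs.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.logging.Logger;

public class ReloadLogging {
    private static final Logger logger = Logger.getLogger(ReloadLogging.class.getName());

    /** Hypothetical stand-in for a table that can be reloaded; not a Cassandra type. */
    public interface ReloadableTable {
        void reload();
    }

    /**
     * Reloads the tables one after another and logs before and after,
     * so an operator can see what the node is doing during the pause.
     * Returns the elapsed time in milliseconds.
     */
    public static long reloadAllWithLogging(List<ReloadableTable> tables) {
        logger.info("Gossip settled; reloading " + tables.size()
                + " tables before starting the CQL server");
        long start = System.nanoTime();
        for (ReloadableTable table : tables) {
            table.reload();
        }
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        logger.info("Reloaded " + tables.size() + " tables in " + elapsedMs + " ms");
        return elapsedMs;
    }
}
```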

I'm sure we can also make that faster. Two things come to mind in particular:
 - it seems the only reason to do this reloading is for the compaction 
strategy(ies) to take any disk boundaries change into account, but reloading 
does other things, and a bit of benchmarking could probably tell us if we could 
save meaningful time by doing a more targeted reloading.
 - parallelizing the work might yield benefits.
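
As a rough sketch of the parallelizing idea, and not a patch against the 
actual code, the reloads could be submitted to a fixed-size thread pool 
instead of being run in a loop. Everything named below (ReloadableTable, 
reloadSequentially, reloadInParallel, the threads parameter) is hypothetical, 
and the sketch assumes that reloading one table is independent of reloading 
another, which would need to be checked against the real code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelReload {
    /** Hypothetical stand-in for a table that can be reloaded; not a Cassandra type. */
    public interface ReloadableTable {
        void reload();
    }

    /** Baseline: reload every table one after another on the calling thread. */
    public static void reloadSequentially(List<ReloadableTable> tables) {
        for (ReloadableTable table : tables) {
            table.reload();
        }
    }

    /**
     * Reload the tables on a fixed-size pool and wait for all of them.
     * Assumes reloads of different tables do not depend on each other.
     */
    public static void reloadInParallel(List<ReloadableTable> tables, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (ReloadableTable table : tables) {
                Runnable task = table::reload;
                pending.add(pool.submit(task));
            }
            // Block until every reload has finished, surfacing any failure.
            for (Future<?> future : pending) {
                future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted while reloading tables", e);
        } catch (ExecutionException e) {
            throw new RuntimeException("A table reload failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this saves meaningful time depends on how much of each reload is 
actually independent work, so it would need benchmarking against the 
sequential version before drawing conclusions.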

> Delay between Gossip settle and CQL port opening during the startup
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-15850
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15850
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jai Bheemsen Rao Dhanwada
>            Priority: Normal
>
> Hello,
> When I am bootstrapping/restarting a Cassandra node, there is a delay between 
> gossip settling and the CQL port opening. Can someone please explain to me 
> where this delay is configured and whether it can be changed? I don't see any 
> information in the logs.
> In my case, as you can see below, there is a ~3 minute delay, and it increases 
> as I increase the number of tables, nodes and DCs.
> {code:java}
> INFO  [main] 2020-05-31 23:51:07,554 Gossiper.java:1692 - Waiting for gossip 
> to settle...
> INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; 
> proceeding
> INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty 
> using native Epoll event loop
> INFO  [main] 2020-05-31 23:54:06,913 Server.java:155 - Using Netty Version: 
> [netty-buffer=netty-buffer-4.0.44.Final.452812a, 
> netty-codec=netty-codec-4.0.44.Final.452812a, 
> netty-codec-haproxy=netty-codec-haproxy-4.0.44.Final.452812a, 
> netty-codec-http=netty-codec-http-4.0.44.Final.452812a, 
> netty-codec-socks=netty-codec-socks-4.0.44.Final.452812a, 
> netty-common=netty-common-4.0.44.Final.452812a, 
> netty-handler=netty-handler-4.0.44.Final.452812a, 
> netty-tcnative=netty-tcnative-1.1.33.Fork26.142ecbb, 
> netty-transport=netty-transport-4.0.44.Final.452812a, 
> netty-transport-native-epoll=netty-transport-native-epoll-4.0.44.Final.452812a,
>  netty-transport-rxtx=netty-transport-rxtx-4.0.44.Final.452812a, 
> netty-transport-sctp=netty-transport-sctp-4.0.44.Final.452812a, 
> netty-transport-udt=netty-transport-udt-4.0.44.Final.452812a]
> INFO  [main] 2020-05-31 23:54:06,913 Server.java:156 - Starting listening for 
> CQL clients on /x.x.x.x:9042 (encrypted)...
> {code}
> Also, during this 3-10 minute delay, I see that the 
> {noformat}
> nodetool compactionstats
> {noformat}
>  command hangs and never responds until the CQL port is up and running.
> Can someone please help me understand the delay here?
> Cassandra Version: 3.11.3
> The issue is easily reproducible with around 300 tables and 100 nodes in 
> a cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
