Hi,

I am trying to understand the flow between zk and SolrCloud nodes during writes and restarts.

*Writes*: When an indexing job runs, it looks like the leader for every shard is identified from zk, the write requests go to the leader, and the data then eventually flows to the replicas.

Question: How frequently do zk and the nodes need to communicate during a "write" operation? Could this cause a few shards/cores to get flaky (going into the down/recovering state at times if they are unable to talk to zk)?

*Restarts*: The size of the data in my zk is around 60 MB, and the zk cfg is as follows:

tickTime=2000
dataDir=/mnt/data/zookeeper
clientPort=2181
initLimit=10
syncLimit=15
maxClientCnxns=100
maxSessionTimeout=120000
autopurge.snapRetainCount=3
autopurge.purgeInterval=1

When I restart a node containing, say, 100 shards, it takes about 20-25 minutes to fully boot up, and I find that Solr is *not* bottlenecked on CPU, memory, or any other system parameter (there are no autowarm or first-searcher queries running). A few shards load faster than the others.

Question: Is the above setting good enough to stream 60 MB of data in total from the zk node every time I restart a node? Is there anything I can do to identify whether zk is the bottleneck while booting up nodes?

Thanks,
Nitin