So you use the Solr Operator for the Solr nodes and a home-made ZK setup?

I normally use the operator for the Solr nodes and an open-source build of the 
Bitnami ZooKeeper Helm chart 
(https://github.com/didx-xyz/bitnami-charts/tree/publish/bitnami/zookeeper). 
The chart handles some best practices for ZooKeeper. You could peek at it for 
inspiration if you want to roll your own.

I have chosen to set ZK_HOST to point to each pod specifically:
solr-zookeeper-0.solr-zookeeper-headless.search.svc.cluster.local:2181,solr-zookeeper-1.solr-zookeeper-headless.search.svc.cluster.local:2181,solr-zookeeper-2.solr-zookeeper-headless.search.svc.cluster.local:2181/solr
I believe this makes the three connections that each Solr pod has to the 
ZooKeeper servers more persistent. The ZK client is designed so that Solr is 
supposed to have a live connection to all three servers, and once the ACTIVE 
connection fails, it immediately switches to one of the other two, 
without having to re-establish any network connection. I don't think you get 
that kind of connection through an ordinary service.
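
A minimal sketch of how such a per-pod connection string could be generated, 
assuming the usual StatefulSet DNS pattern 
<statefulset>-<ordinal>.<headless-service>.<namespace>.svc.<cluster-domain>. 
The helper name build_zk_host and its parameters are hypothetical and only 
serve as illustration; adjust names, replica count and chroot to your setup.

```python
def build_zk_host(statefulset, headless_service, namespace, replicas=3,
                  port=2181, chroot="/solr", cluster_domain="cluster.local"):
    """Build a ZK_HOST value that lists every ZooKeeper pod by its stable DNS name.

    Hypothetical helper: assumes pods are named <statefulset>-<ordinal> and are
    resolvable through a headless service, as in the example above.
    """
    hosts = [
        f"{statefulset}-{i}.{headless_service}.{namespace}.svc.{cluster_domain}:{port}"
        for i in range(replicas)
    ]
    # The chroot is appended once, after the last host:port pair.
    return ",".join(hosts) + chroot


if __name__ == "__main__":
    # Reproduces the connection string shown above for a 3-node ensemble.
    print(build_zk_host("solr-zookeeper", "solr-zookeeper-headless", "search"))
```

Listing the pods explicitly, rather than a single service name, is what lets 
the ZK client see all ensemble members individually.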

Jan

> On 13 May 2026 at 20:38, Liam Newton <[email protected]> wrote:
> 
> Hi,
> 
> We currently do not manage ZooKeeper through the Solr Operator. We run it as 
> a StatefulSet with a service in front, as you mentioned. The service is 
> headless.
> 
> We have implemented a disruption budget together with anti-affinity rules so 
> that ZooKeeper cannot have more than one pod taken down at a time.
> 
> So when we experience these issues, it is when a single pod is taken down, 
> which is usually the ZooKeeper leader.
> 
> 
> Here is a brief log that appears when a Zookeeper is terminated:
> 
> Starting Solr
> [0.002s][warning][pagesize] UseLargePages disabled, no large pages configured 
> and available on the system.
> WARNING: Using incubator modules: jdk.incubator.vector
> WARNING: A terminally deprecated method in sun.misc.Unsafe has been called
> WARNING: sun.misc.Unsafe::arrayBaseOffset has been called by 
> com.lmax.disruptor.RingBufferFields 
> (file:/opt/solr-10.0.0/server/lib/ext/disruptor-3.4.4.jar)
> WARNING: Please consider reporting this to the maintainers of class 
> com.lmax.disruptor.RingBufferFields
> WARNING: sun.misc.Unsafe::arrayBaseOffset will be removed in a future release
> 2026-05-11 10:37:27.246 WARN  (main) [] o.a.s.c.u.EnvUtils You are passing in 
> deprecated system property host and should upgrade to using 
> solr.host.advertise instead.  The deprecated property support will be removed 
> in future version of Solr.
> 2026-05-11 10:37:27.388 WARN  (main) [] o.a.s.c.c.SolrZkClient Using default 
> ZkCredentialsInjector. ZkCredentialsInjector is not secure, it creates an 
> empty list of credentials which leads to 'OPEN_ACL_UNSAFE' ACLs to Zookeeper 
> nodes
> 2026-05-11 10:37:27.389 WARN  (main) [] o.a.s.c.c.SolrZkClient Using default 
> ZkACLProvider. DefaultZkACLProvider is not secure, it creates 
> 'OPEN_ACL_UNSAFE' ACLs to Zookeeper nodes
> 2026-05-11 10:37:28.908 WARN  (main) [] o.a.s.c.c.SolrZkClient Using default 
> ZkCredentialsInjector. ZkCredentialsInjector is not secure, it creates an 
> empty list of credentials which leads to 'OPEN_ACL_UNSAFE' ACLs to Zookeeper 
> nodes
> 2026-05-11 10:37:28.908 WARN  (main) [] o.a.s.c.c.SolrZkClient Using default 
> ZkACLProvider. DefaultZkACLProvider is not secure, it creates 
> 'OPEN_ACL_UNSAFE' ACLs to Zookeeper nodes
> 2026-05-11 10:37:29.135 WARN  (main) [] o.a.s.c.ZkController Contents of 
> zookeeper /security.json are world-readable; consider setting up ACLs as 
> described in 
> https://solr.apache.org/guide/solr/latest/deployment-guide/zookeeper-access-control.html
> 2026-05-11 10:37:29.374 WARN  (main) [] o.a.s.c.CoreContainer Not all 
> security plugins configured!  authentication=disabled authorization=disabled. 
>  Solr is only as secure as you make it. Consider configuring 
> authentication/authorization before exposing Solr to users internal or 
> external.  See https://s.apache.org/solrsecurity for more info
> 2026-05-11 10:37:30.081 WARN  (main) [] o.g.j.m.i.MessagingBinders A class 
> jakarta.activation.DataSource for a default provider 
> MessageBodyWriter<jakarta.activation.DataSource> was not found. The provider 
> is not available.
> 2026-05-11 10:38:11.756 ERROR 
> (updateExecutor-8-thread-1-processing-solrcloud-0.solrcloud-service:8983_solr 
> test2_shard1_replica_n2 solrcloud-0.solrcloud-service-18 test2 shard1 
> core_node4) [c:test2 s:shard1 r:core_node4 x:test2_shard1_replica_n2] 
> o.a.s.u.SolrCmdDistributor Exception making request 
> =>org.apache.solr.client.solrj.SolrServerException: IOException occurred when 
> talking to server at: 
> http://solrcloud-1.solrcloud-service:8983/solr/test2_shard1_replica_n1/update
>       at 
> org.apache.solr.client.solrj.jetty.HttpJettySolrClient.request(HttpJettySolrClient.java:500)
> 
> 2026-05-11 10:38:11.756 ERROR 
> (updateExecutor-8-thread-3-processing-solrcloud-0.solrcloud-service:8983_solr 
> attemp3_shard1_replica_n2 solrcloud-0.solrcloud-service-19 attemp3 shard1 
> core_node4) [c:attemp3 s:shard1 r:core_node4 x:attemp3_shard1_replica_n2] 
> o.a.s.u.SolrCmdDistributor Exception making request 
> =>org.apache.solr.client.solrj.SolrServerException: IOException occurred when 
> talking to server at: 
> http://solrcloud-1.solrcloud-service:8983/solr/attemp3_shard1_replica_n1/update
>       at 
> org.apache.solr.client.solrj.jetty.HttpJettySolrClient.request(HttpJettySolrClient.java:500)
> 
> 2026-05-11 10:38:11.756 ERROR 
> (updateExecutor-8-thread-2-processing-solrcloud-0.solrcloud-service:8983_solr 
> weoroo_shard1_replica_n2 solrcloud-0.solrcloud-service-17 weoroo shard1 
> core_node4) [c:weoroo s:shard1 r:core_node4 x:weoroo_shard1_replica_n2] 
> o.a.s.u.SolrCmdDistributor Exception making request 
> =>org.apache.solr.client.solrj.SolrServerException: IOException occurred when 
> talking to server at: 
> http://solrcloud-1.solrcloud-service:8983/solr/weoroo_shard1_replica_n1/update
>       at 
> org.apache.solr.client.solrj.jetty.HttpJettySolrClient.request(HttpJettySolrClient.java:500)
> 
> 2026-05-11 10:38:11.758 WARN  
> (qtp1745043985-36-solrcloud-0.solrcloud-service-18) [c:test2 s:shard1 
> r:core_node4 x:test2_shard1_replica_n2] 
> o.a.s.u.p.DistributedZkUpdateProcessor Error sending update to 
> http://solrcloud-1.solrcloud-service:8983/solr 
> =>org.apache.solr.client.solrj.SolrServerException: IOException occurred when 
> talking to server at: 
> http://solrcloud-1.solrcloud-service:8983/solr/test2_shard1_replica_n1/update
>       at 
> org.apache.solr.client.solrj.jetty.HttpJettySolrClient.request(HttpJettySolrClient.java:500)
> 
> 2026-05-11 10:38:11.758 WARN  
> (qtp1745043985-37-solrcloud-0.solrcloud-service-19) [c:attemp3 s:shard1 
> r:core_node4 x:attemp3_shard1_replica_n2] 
> o.a.s.u.p.DistributedZkUpdateProcessor Error sending update to 
> http://solrcloud-1.solrcloud-service:8983/solr 
> =>org.apache.solr.client.solrj.SolrServerException: IOException occurred when 
> talking to server at: 
> http://solrcloud-1.solrcloud-service:8983/solr/attemp3_shard1_replica_n1/update
>       at 
> org.apache.solr.client.solrj.jetty.HttpJettySolrClient.request(HttpJettySolrClient.java:500)
> 
> 2026-05-11 10:38:11.758 WARN  
> (qtp1745043985-35-solrcloud-0.solrcloud-service-17) [c:weoroo s:shard1 
> r:core_node4 x:weoroo_shard1_replica_n2] 
> o.a.s.u.p.DistributedZkUpdateProcessor Error sending update to 
> http://solrcloud-1.solrcloud-service:8983/solr 
> =>org.apache.solr.client.solrj.SolrServerException: IOException occurred when 
> talking to server at: 
> http://solrcloud-1.solrcloud-service:8983/solr/weoroo_shard1_replica_n1/update
>       at 
> org.apache.solr.client.solrj.jetty.HttpJettySolrClient.request(HttpJettySolrClient.java:500)
> 
> 2026-05-11 10:39:29.690 ERROR 
> (qtp1745043985-35-solrcloud-0.solrcloud-service-47) [] 
> o.a.s.h.RequestHandlerBase Client exception 
> =>org.apache.solr.common.SolrException: No such cluster property 
> [overseerEnabled]
>       at 
> org.apache.solr.handler.admin.api.ClusterProperty.getClusterProperty(ClusterProperty.java:92)
> 
> 2026-05-11 10:40:19.252 ERROR 
> (qtp1745043985-35-solrcloud-0.solrcloud-service-66) [] 
> o.a.s.h.RequestHandlerBase Client exception 
> =>org.apache.solr.common.SolrException: Unknown action: OVERSEERSTATUS"
>       at 
> org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:280)
> 
> 2026-05-11 10:42:42.311 WARN  
> (main-SendThread(zookeeper-1.zookeeper-headless.solr.svc.cluster.local:2181)) 
> [] o.a.z.ClientCnxn Session 0x20006a16d230001 for server 
> zookeeper-1.zookeeper-headless.solr.svc.cluster.local/10.101.80.152:2181, 
> Closing socket connection. Attempting reconnect except it is a 
> SessionExpiredException or SessionTimeoutException. 
> =>org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read 
> additional data from server sessionid 0x20006a16d230001, likely server has 
> closed socket
>       at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
> 
> 2026-05-11 10:42:42.419 WARN  
> (OverseerCollectionConfigSetProcessor-144122478466367489-solrcloud-0.solrcloud-service:8983_solr-n_0000000801)
>  [] i.o.s.m.SdkObservableInstrument 
> CallbackRegistration{instrumentDescriptors=[InstrumentDescriptor{name=solr_overseer_collection_work_queue_size,
>  description=Size of overseer's collection work queue, unit=, 
> type=OBSERVABLE_GAUGE, valueType=LONG, 
> advice=Advice{explicitBucketBoundaries=null, attributes=null}}]} has called 
> close() multiple times.
> 2026-05-11 10:42:42.721 WARN  
> (main-SendThread(zookeeper-0.zookeeper-headless.solr.svc.cluster.local:2181)) 
> [] o.a.z.ClientCnxn Session 0x20006a16d230001 for server 
> zookeeper-0.zookeeper-headless.solr.svc.cluster.local/10.101.80.182:2181, 
> Closing socket connection. Attempting reconnect except it is a 
> SessionExpiredException or SessionTimeoutException. 
> =>org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read 
> additional data from server sessionid 0x20006a16d230001, likely server has 
> closed socket
>       at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
> 
> 2026-05-11 10:42:43.067 WARN  
> (main-SendThread(zookeeper-2.zookeeper-headless.solr.svc.cluster.local:2181)) 
> [] o.a.z.ClientCnxn Session 0x20006a16d230001 for server 
> zookeeper-2.zookeeper-headless.solr.svc.cluster.local/10.101.80.221:2181, 
> Closing socket connection. Attempting reconnect except it is a 
> SessionExpiredException or SessionTimeoutException. 
> =>java.net.NoRouteToHostException: No route to host
>       at java.base/sun.nio.ch.Net.pollConnect(Native Method)
> 
> 2026-05-11 10:42:44.529 WARN  (OverseerExitThread) [] o.a.s.c.Overseer I 
> (id=144122478466367489-solrcloud-0.solrcloud-service:8983_solr-n_0000000801) 
> am exiting, but I'm still the leader
> 2026-05-11 10:42:44.839 ERROR 
> (qtp1745043985-34-solrcloud-0.solrcloud-service-128) [] 
> o.a.s.h.RequestHandlerBase Client exception 
> =>org.apache.solr.common.SolrException: We are not the leader
>       at 
> org.apache.solr.handler.admin.PrepRecoveryOp.lambda$execute$0(PrepRecoveryOp.java:98)
> 
> 2026-05-11 10:42:44.840 ERROR 
> (qtp1745043985-38-solrcloud-0.solrcloud-service-129) [] 
> o.a.s.h.RequestHandlerBase Client exception 
> =>org.apache.solr.common.SolrException: We are not the leader
>       at 
> org.apache.solr.handler.admin.PrepRecoveryOp.lambda$execute$0(PrepRecoveryOp.java:98)
> 
> 2026-05-11 10:42:44.842 ERROR 
> (qtp1745043985-35-solrcloud-0.solrcloud-service-130) [] 
> o.a.s.h.RequestHandlerBase Client exception 
> =>org.apache.solr.common.SolrException: We are not the leader
>       at 
> org.apache.solr.handler.admin.PrepRecoveryOp.lambda$execute$0(PrepRecoveryOp.java:98)
> 
> 2026-05-11 10:42:50.440 WARN  
> (recoveryExecutor-11-thread-2-processing-solrcloud-0.solrcloud-service:8983_solr
>  attemp3_shard1_replica_n2 attemp3 shard1 core_node4) [c:attemp3 s:shard1 
> r:core_node4 x:attemp3_shard1_replica_n2] o.a.s.u.PeerSyncWithLeader no frame 
> of reference to tell if we've missed updates
> 2026-05-11 10:42:50.440 WARN  
> (recoveryExecutor-11-thread-3-processing-solrcloud-0.solrcloud-service:8983_solr
>  weoroo_shard1_replica_n2 weoroo shard1 core_node4) [c:weoroo s:shard1 
> r:core_node4 x:weoroo_shard1_replica_n2] o.a.s.u.PeerSyncWithLeader no frame 
> of reference to tell if we've missed updates
> 2026-05-11 10:42:50.445 WARN  
> (recoveryExecutor-11-thread-1-processing-solrcloud-0.solrcloud-service:8983_solr
>  test2_shard1_replica_n2 test2 shard1 core_node4) [c:test2 s:shard1 
> r:core_node4 x:test2_shard1_replica_n2] o.a.s.u.PeerSyncWithLeader no frame 
> of reference to tell if we've missed updates
> 
> 
> Thanks,
> 
> 
> Liam Newton
> 
> Mobile: 07500270375
> 
> Email: [email protected]
> 
> Platform Engineer
> 
> 
> 
> 
> ________________________________
> From: Jan Høydahl <[email protected]>
> Sent: Wednesday, May 13, 2026 6:49 PM
> To: [email protected] <[email protected]>
> Subject: [EXTERNAL] Re: solr v10 collection recovery issues
> 
> Hi,
> 
> How do you handle ZK in k8s? Apparently not through the Solr Operator. Do you 
> use a StatefulSet with a service in front? A normal service or headless? Have 
> you configured the STS so that k8s is not allowed to take down more than one 
> of the pod replicas at a time, and have you spread them across unique k8s 
> nodes? Many questions... More details, config, and logs would be helpful. 
> Solr does not like the entire ZK ensemble being unavailable.
> 
> Jan
> 
>> On 13 May 2026 at 19:37, Liam Newton <[email protected]> wrote:
>> 
>> Hello,
>> 
>> I am new to this, but I was hoping to raise a bug or ask for some advice.
>> 
>> We upgraded our Solr cluster to version 10 within the past two months and 
>> noticed that when we restart our ZooKeeper instances, in particular the 
>> leader instance, Solr has issues managing its collections.
>> 
>> What we experience is that collections go into the degraded state and stay 
>> there until they recover. However, recovery time depends on the number of 
>> collections. For example, 10 collections with 2 replicas each can take up 
>> to 5 minutes, and the time increases exponentially with the number of 
>> replicas. Sometimes a node will also go into the down state and stay 
>> there.
>> 
>> We had a brief look through the code and noticed that ZooKeeper 
>> disconnection is now handled differently, and we believe that the new 
>> change is causing this effect.
>> 
>> We run our ZooKeeper and Solr instances in Kubernetes and experience this 
>> quite regularly, because Kubernetes sometimes rebalances workloads of its 
>> own accord, causing ZooKeeper to be moved or recycled more than once a 
>> week. To ensure that this wasn't a quirk of ZooKeeper, we also reproduced 
>> the issue in a Docker Compose setup and saw the same behaviour.
>> 
>> We went back to v9 and did not experience any issues at all.
>> 
>> Below I will add a Docker Compose template as well as a script to reproduce 
>> this behaviour.
>> 
>> Docker-compose.yml
>> services:
>> 
>> # ─── ZooKeeper ensemble (3 nodes for quorum) ───────────────────────────────
>> 
>> zookeeper-0:
>>   image: zookeeper:3.9
>>   hostname: zookeeper-0
>>   restart: unless-stopped
>>   environment:
>>     ZOO_MY_ID: "1"
>>     ZOO_SERVERS: "server.1=zookeeper-0:2888:3888;2181 server.2=zookeeper-1:2888:3888;2181 server.3=zookeeper-2:2888:3888;2181"
>>   volumes:
>>     - zk0-data:/data
>>     - zk0-datalog:/datalog
>>   healthcheck:
>>     # AdminServer HTTP endpoint: works regardless of election state;
>>     # 4LW/zkServer.sh unreliable in 3.9
>>     test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/commands/ruok"]
>>     interval: 5s
>>     timeout: 5s
>>     retries: 20
>>     start_period: 20s
>> 
>> zookeeper-1:
>>   image: zookeeper:3.9
>>   hostname: zookeeper-1
>>   restart: unless-stopped
>>   environment:
>>     ZOO_MY_ID: "2"
>>     ZOO_SERVERS: "server.1=zookeeper-0:2888:3888;2181 server.2=zookeeper-1:2888:3888;2181 server.3=zookeeper-2:2888:3888;2181"
>>   volumes:
>>     - zk1-data:/data
>>     - zk1-datalog:/datalog
>>   healthcheck:
>>     test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/commands/ruok"]
>>     interval: 5s
>>     timeout: 5s
>>     retries: 20
>>     start_period: 20s
>> 
>> zookeeper-2:
>>   image: zookeeper:3.9
>>   hostname: zookeeper-2
>>   restart: unless-stopped
>>   environment:
>>     ZOO_MY_ID: "3"
>>     ZOO_SERVERS: "server.1=zookeeper-0:2888:3888;2181 server.2=zookeeper-1:2888:3888;2181 server.3=zookeeper-2:2888:3888;2181"
>>   volumes:
>>     - zk2-data:/data
>>     - zk2-datalog:/datalog
>>   healthcheck:
>>     test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/commands/ruok"]
>>     interval: 5s
>>     timeout: 5s
>>     retries: 20
>>     start_period: 20s
>> 
>> # ─── ZK chroot init (creates /solrcloud-test znode, then exits) ─────────────
>> 
>> zk-init:
>>   image: zookeeper:3.9
>>   restart: "no"
>>   depends_on:
>>     zookeeper-0:
>>       condition: service_healthy
>>   command: >
>>     sh -c "zkCli.sh -server zookeeper-0:2181 create /solrcloud-test '' 2>&1 | tail -1; echo 'ZK chroot ready'"
>>   healthcheck:
>>     test: ["CMD-SHELL", "exit 0"]
>>     interval: 5s
>>     retries: 1
>> 
>> # ─── SolrCloud nodes ────────────────────────────────────────────────────────
>> 
>> solrcloud-0:
>>   image: solr:10
>>   hostname: solrcloud-0
>>   restart: unless-stopped
>>   environment:
>>     SOLR_SKIP_ROOT_CHECK: "true"
>>     SOLR_PORT: "8983"
>>     SOLR_JAVA_MEM: "-XX:+UseContainerSupport -XX:MaxRAMPercentage=60.0"
>>     # /solrcloud-test chroot isolates this cluster within the ZK ensemble
>>     ZK_HOST: "zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181/solrcloud-test"
>>     SOLR_HOST: "solrcloud-0"
>>     SOLR_LOG_LEVEL: "WARN"
>>     LOG4J_FORMAT_MSG_NO_LOOKUPS: "true"
>>     SOLR_OPTS: "-Dhost=solrcloud-0"
>>   ports:
>>     - "8983:8983"
>>   depends_on:
>>     zookeeper-0:
>>       condition: service_healthy
>>     zookeeper-1:
>>       condition: service_healthy
>>     zookeeper-2:
>>       condition: service_healthy
>>     zk-init:
>>       condition: service_completed_successfully
>>   volumes:
>>     - solr0-data:/var/solr
>>     - ./config/solr-log.xml:/opt/solr/server/resources/log4j2.xml:ro
>>     - ./config/solr-log.xml:/var/solr/log4j2.xml:ro
>>   healthcheck:
>>     test: ["CMD", "curl", "-f", "http://localhost:8983/solr/admin/info/system"]
>>     interval: 10s
>>     timeout: 5s
>>     retries: 10
>>     start_period: 60s
>> 
>> solrcloud-1:
>>   image: solr:10
>>   hostname: solrcloud-1
>>   restart: unless-stopped
>>   environment:
>>     SOLR_SKIP_ROOT_CHECK: "true"
>>     SOLR_PORT: "8983"
>>     SOLR_JAVA_MEM: "-XX:+UseContainerSupport -XX:MaxRAMPercentage=60.0"
>>     ZK_HOST: "zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181/solrcloud-test"
>>     SOLR_HOST: "solrcloud-1"
>>     SOLR_LOG_LEVEL: "WARN"
>>     LOG4J_FORMAT_MSG_NO_LOOKUPS: "true"
>>     SOLR_OPTS: "-Dhost=solrcloud-1"
>>   ports:
>>     - "8984:8983"
>>   depends_on:
>>     zookeeper-0:
>>       condition: service_healthy
>>     zookeeper-1:
>>       condition: service_healthy
>>     zookeeper-2:
>>       condition: service_healthy
>>     zk-init:
>>       condition: service_completed_successfully
>>   volumes:
>>     - solr1-data:/var/solr
>>     - ./config/solr-log.xml:/opt/solr/server/resources/log4j2.xml:ro
>>     - ./config/solr-log.xml:/var/solr/log4j2.xml:ro
>>   healthcheck:
>>     test: ["CMD", "curl", "-f", "http://localhost:8983/solr/admin/info/system"]
>>     interval: 10s
>>     timeout: 5s
>>     retries: 10
>>     start_period: 60s
>> 
>> volumes:
>> zk0-data:
>> zk0-datalog:
>> zk1-data:
>> zk1-datalog:
>> zk2-data:
>> zk2-datalog:
>> solr0-data:
>> solr1-data:
>> 
>> Zk-failover-test.ps1
>> 
>> $ErrorActionPreference = "Stop"
>> Set-Location $PSScriptRoot
>> 
>> $ZkContainers = @(
>>   "solrcloud-docker-compose-zookeeper-0-1",
>>   "solrcloud-docker-compose-zookeeper-1-1",
>>   "solrcloud-docker-compose-zookeeper-2-1"
>> )
>> $SolrContainers = @(
>>   "solrcloud-docker-compose-solrcloud-0-1",
>>   "solrcloud-docker-compose-solrcloud-1-1"
>> )
>> $AllHealthChecked = $ZkContainers + $SolrContainers
>> 
>> # ── 1. Start the stack ────────────────────────────────────────────────────────
>> 
>> Write-Host "`n[1/5] Starting Docker Compose stack..." -ForegroundColor Cyan
>> docker compose up -d
>> if ($LASTEXITCODE -ne 0) { throw "docker compose up failed" }
>> 
>> # ── 2. Wait for all containers to be healthy ──────────────────────────────────
>> 
>> Write-Host "`n[2/5] Waiting for all containers to be healthy..." -ForegroundColor Cyan
>> $timeout = 300
>> $elapsed = 0
>> 
>> while ($elapsed -lt $timeout) {
>>   Start-Sleep -Seconds 5
>>   $elapsed += 5
>> 
>>   $statuses = $AllHealthChecked | ForEach-Object {
>>       docker inspect $_ --format "{{.State.Health.Status}}" 2>$null
>>   }
>> 
>>   $unhealthy = ($statuses | Where-Object { $_ -ne "healthy" }).Count
>>   Write-Host "  ${elapsed}s - $($statuses.Count - $unhealthy)/$($statuses.Count) healthy"
>> 
>>   if ($unhealthy -eq 0) { break }
>> }
>> 
>> if ($elapsed -ge $timeout) { throw "Containers did not become healthy within ${timeout}s" }
>> Write-Host "  All containers healthy." -ForegroundColor Green
>> 
>> # ── 3. Create 10 Solr collections ─────────────────────────────────────────────
>> 
>> Write-Host "`n[3/5] Creating 10 Solr collections..." -ForegroundColor Cyan
>> for ($i = 1; $i -le 10; $i++) {
>>   $name = "test-collection-$i"
>>   $uri  = "http://localhost:8983/solr/admin/collections?action=CREATE&name=$name&numShards=1&replicationFactor=2&wt=json"
>>   try {
>>       $resp = Invoke-RestMethod -Uri $uri -Method Get
>>       $status = $resp.responseHeader.status
>>       Write-Host "  Created $name (status: $status)"
>>   } catch {
>>       Write-Warning "  Failed to create $name`: $_"
>>   }
>> }
>> 
>> # ── 4. Find the ZooKeeper leader ──────────────────────────────────────────────
>> 
>> Write-Host "`n[4/5] Finding ZooKeeper leader..." -ForegroundColor Cyan
>> $leaderContainer = $null
>> 
>> foreach ($container in $ZkContainers) {
>>   $stat = docker exec $container wget -qO- "http://localhost:8080/commands/stat" 2>&1
>>   if ($stat -match '"server_state"\s*:\s*"leader"') {
>>       $leaderContainer = $container
>>       Write-Host "  Leader: $container" -ForegroundColor Yellow
>>       break
>>   }
>> }
>> 
>> if (-not $leaderContainer) { throw "Could not find ZooKeeper leader" }
>> 
>> # ── 5. Restart the leader ─────────────────────────────────────────────────────
>> 
>> Write-Host "`n[5/5] Restarting ZK leader ($leaderContainer)..." -ForegroundColor Cyan
>> docker restart $leaderContainer
>> Write-Host "  Restarted. New leader election underway." -ForegroundColor Green
>> 
>> Write-Host "`nDone." -ForegroundColor Green
>> 
>> 
>> We had a look through the Solr Jira to see if there were any known bugs but 
>> could not find any that match these symptoms.
>> 
>> If this could be raised as a bug, or if someone could advise on possible 
>> solutions, that would be much appreciated.
>> 
>> Thanks,
>> 
>> 
>> Liam Newton
>> 
>> Email: [email protected]
>> 
>> Platform Engineer
>> 
>> Disclaimer
>> 
>> The information contained in this communication from the sender is 
>> confidential. It is intended solely for use by the recipient and others 
>> authorized to receive it. If you are not the recipient, you are hereby 
>> notified that any disclosure, copying, distribution or taking action in 
>> relation of the contents of this information is strictly prohibited and may 
>> be unlawful.
>> 
>> This email has been scanned for viruses and malware, and may have been 
>> automatically archived by Mimecast, a leader in email security and cyber 
>> resilience. Mimecast integrates email defenses with brand protection, 
>> security awareness training, web security, compliance and other essential 
>> capabilities. Mimecast helps protect large and small organizations from 
>> malicious activity, human error and technology failure; and to lead the 
>> movement toward building a more resilient world. To find out more, visit our 
>> website.