ramayer opened a new issue, #498:
URL: https://github.com/apache/solr-operator/issues/498

   Solr Operator is working great in about half of the Kubernetes environments 
I'm testing, but fails in the other half.
   
   It fails for me on Ubuntu 22.04 using a Kubernetes environment started with:
   
       minikube start
   
   where each Solr instance seems to communicate with the other two just 
fine, but appears to hit a network timeout when it attempts to communicate 
with another shard on the same host. I can create some collections, but am 
unable to create any collection that has as many shards as there are Solr pod 
instances.
   
   It works fine for me on the same Ubuntu 22.04 host using:
   
        minikube start --container-runtime=containerd --cpus 4 --mount-string=$HOME/proj/kube/persistent_volumes:/mnt/host --mount
   
   It fails for me on macOS using a Kubernetes environment created with:
   
      colima start --cpu 4 --memory 8 --kubernetes
   
   where the ZooKeeper cluster never seems to reach a quorum, apparently 
timing out when the second ZooKeeper node attempts to connect to 
example-solrcloud-zookeeper-client:2181. It seems as if the default networking 
in colima's Kubernetes (k3s, I think) does not allow connections to that 
service until the service is ready (which never seems to happen), but I don't 
know how to debug this further.
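   
   A minimal sketch of checks that might narrow this down, not yet run against 
the failing cluster. The service name is the one from the timeout above; the 
`default` namespace, the `cluster.local` cluster domain, the `busybox:1.36` 
image, and the helper names `svc_fqdn` and `check_service` are assumptions made 
for illustration.

```shell
# Service name taken from the timeout above; namespace is an assumption.
SVC="${SVC:-example-solrcloud-zookeeper-client}"
NS="${NS:-default}"

# Build the cluster-internal DNS name for a Service.
# Assumes the default cluster domain "cluster.local".
svc_fqdn() {
  printf '%s.%s.svc.cluster.local\n' "$1" "$2"
}

# Run the checks; requires kubectl pointed at the failing cluster.
check_service() {
  # Does the Service have any ready endpoints behind it?
  kubectl -n "$NS" get endpoints "$SVC"
  # Are the ZooKeeper pods Ready, and which node is each one on?
  kubectl -n "$NS" get pods -o wide
  # Can a throwaway pod resolve the Service name?
  kubectl -n "$NS" run dns-check --rm -i --restart=Never \
    --image=busybox:1.36 -- nslookup "$(svc_fqdn "$SVC" "$NS")"
}

# Only touch the cluster when explicitly asked to.
if [ "${RUN_CHECKS:-0}" = "1" ]; then
  check_service
fi
```

   Running it with `RUN_CHECKS=1` set would print the endpoints, pod placement, 
and DNS lookup result. An empty endpoints list alongside pods that are not 
Ready would be consistent with the service only accepting connections once the 
pods behind it are ready.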
   
   It works fine for me on the same macOS host using a Kubernetes environment 
created with:
   
       podman machine init -m 16000 --cpus 4 -v "$HOME:$HOME" --rootful
       podman machine start
       minikube start --driver=podman --cpus 4 --memory 12000 --profile=minikube-on-podman
   
   It works fine for me on Microsoft Azure's AKS using the instructions 
[here](https://learn.microsoft.com/en-us/azure/developer/terraform/create-k8s-cluster-with-tf-and-aks).
   
   
   In all cases, after creating the Kubernetes environment, I'm attempting to 
create the Solr cluster with:
   
       kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/cloud/deploy.yaml
       kubectl create -f https://solr.apache.org/operator/downloads/crds/v0.6.0/all-with-dependencies.yaml
       helm install solr-operator apache-solr/solr-operator --version 0.6.0
       helm install example-solr apache-solr/solr --version 0.6.0 \
         --set image.tag=9.0 \
         --set solrOptions.security.authenticationType="Basic" \
         --set solrOptions.javaMemory="-Xms300m -Xmx300m" \
         --set addressability.external.method=Ingress \
         --set addressability.external.domainName="ing.local.domain" \
         --set addressability.external.useExternalAddress="true" \
         --set ingressOptions.ingressClassName="nginx"
   
   I think most of the failure modes are related to when, during the startup 
process, Kubernetes exposes enough information (DNS? IP addresses?) to the 
nodes, but I don't know Kubernetes networking well enough to debug this.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

