Hi All,

I have a head-scratcher at the moment with our SOLRCloud deployment.

We have several SOLR Docker images running on Alpine Linux, deployed as AWS
ECS Fargate tasks.

We also have separate Zookeeper Linux images running as AWS ECS Fargate
tasks, forming a quorum - i.e. three nodes working together, orchestrating
SOLR leader elections and holding SOLR configurations for collections.

SOLR tasks and Zookeeper tasks resolve each other's endpoint IP addresses
via DNS (AWS Route53).

As is the way with AWS, it oftentimes schedules forced redeployments of
Fargate services when it updates the underlying platform for patching and
security updates.

The situation we have is that when the Zookeeper ECS tasks are redeployed
and given new IP addresses, SOLR does not appear to realise this and doesn't
seem to re-query DNS for the new Zookeeper addresses.

I have tested from within the running SOLR ECS tasks that they can resolve
the Route53 names of other ECS services after those services restart, which
they can.  So it doesn't appear to be an issue with the Alpine OS that the
SOLR images use to run the SOLR Java app.
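For what it's worth, the same check can also be done from inside the JVM
itself (rather than via the OS resolver tools in the container), since that
is the code path SOLR actually uses.  A minimal sketch - the hostname here
is just a placeholder, substitute one of your Route53 service names:

```java
import java.net.InetAddress;

// Minimal sketch: resolve a hostname through InetAddress, the same
// mechanism SOLR and the ZK client ultimately use, so JVM-level caching
// (not just OS-level resolution) is what gets exercised.
public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder default; pass a real Route53 name as the first arg.
        String host = args.length > 0 ? args[0] : "localhost";
        for (InetAddress a : InetAddress.getAllByName(host)) {
            System.out.println(a.getHostAddress());
        }
    }
}
```

Running this twice, before and after a Zookeeper redeploy, would show
whether the JVM's own lookup returns the new addresses.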

I also tried adding the following JVM directives to the Jetty startup
script that the SOLR tasks start with:

-Dsun.net.inetaddr.ttl=0
-Dnetworkaddress.cache.ttl=0
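One wrinkle worth checking here (an assumption on my part, not a confirmed
diagnosis): `networkaddress.cache.ttl` is read by the JVM as a *Security*
property, not a system property, so setting it with `-D` may be silently
ignored - only `sun.net.inetaddr.ttl` is honoured as a system property.
A sketch of setting it the way the JVM actually reads it:

```java
import java.security.Security;

// networkaddress.cache.ttl is consulted via Security.getProperty(), so a
// -D flag on the command line may have no effect.  Setting it here (or in
// $JAVA_HOME/lib/security/java.security) before any lookup happens does.
public class DnsCacheTtl {
    public static void main(String[] args) {
        Security.setProperty("networkaddress.cache.ttl", "0");
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```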

I deployed this change, then after SOLR had come back up and was reporting
all OK, redeployed the Zookeeper ECS tasks.   SOLR behaved the same way, as
if it was caching the ZK IP addresses somewhere.
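One hypothesis (unverified - worth checking against the ZK jar your SOLR
build actually bundles): the ZooKeeper 3.4.x client resolves the ensemble
hostnames once, when the connect string is parsed, and then holds on to the
resolved `InetSocketAddress` objects; re-resolution on connection failure
was only added to ZooKeeper later (ZOOKEEPER-1506).  If so, the JVM TTL
flags make no difference, because no further `InetAddress` lookup ever
happens.  The relevant `InetSocketAddress` behaviour can be sketched as:

```java
import java.net.InetSocketAddress;

// A resolved InetSocketAddress captures the IP at construction time and
// never re-resolves; JVM DNS-cache TTL settings don't apply to it.  Any
// client that keeps such objects around keeps the old IPs with them.
public class BakedInAddress {
    public static void main(String[] args) {
        InetSocketAddress a = new InetSocketAddress("localhost", 2181);
        System.out.println(a.isUnresolved());       // false: IP captured now
        System.out.println(a.getAddress() != null); // true, held from now on
    }
}
```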

We are running SOLR version 4.10.2, which we cannot easily upgrade from, as
it would require our development team to do significant refactoring of the
core application that uses SOLR.

Does anybody know if there are any configuration options within SOLR
itself, or where in the SOLR code it may be caching the ZK IPs?

Any help would be much appreciated!!

Cheers,
Daz