The Jenkins builds aren't in a great state right now.

Currently the Solr-Check-main
<https://ci-builds.apache.org/job/Solr/job/Solr-Check-main> build is
failing consistently because stray Solr processes are being found on the
box (the integration tests expect nothing else to be running). Now
that we have port randomization for the integration tests, it's a very good
sign that the stray Solr processes all use port 8983, meaning that we
aren't leaking Solrs in the integration tests.

Because of this, the culprit seems to be that the smoke tests (which still
start a Solr on port 8983) are leaking processes, and looking at the logs,
that seems to be the case (Solr-Smoketest-9.4
<https://ci-builds.apache.org/job/Solr/job/Solr-Smoketest-9.4>,
Solr-Smoketest-9.x
<https://ci-builds.apache.org/job/Solr/job/Solr-Smoketest-9.x>). So fixing
the leaked Solr processes in the smoke tests will in turn fix both the smoke test
builds and the main check.
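
For anyone who wants to confirm a leak on a box before or after a smoke test run, here is a minimal sketch in Python. It only checks whether something accepts TCP connections on a port; it does not identify the owning process. The function name `port_in_use` and the default host are my own choices for illustration, not anything from the build scripts.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: a listener on 8983 after the smoke tests finish
    would suggest (but not prove) that a Solr process was left behind.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(8983)` once the smoke tests have finished should return False if everything was shut down cleanly.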

As for the Solr-Check-9.x
<https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x> build, it is
running on Crave, so it doesn't have the same issue with leaked Solr
processes. However, on Crave there seems to be an issue with the mTLS
tests. (Strangely, Solr-Check-main also has this issue, but only on the
lucene-solr-1 machine, not on lucene-solr-2.) We need to investigate why the TLS
tests pass locally for everyone (and on half of the Jenkins boxes), but not
on Crave.

Lastly, the Docker tests are broken in a very strange way. A while ago, I
added tests to make sure that the Prometheus exporter can communicate
correctly in Docker. These tests seem to fail on both
Solr-Docker-Nightly-main
<https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-main> and
Solr-Docker-Nightly-9.x
<https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-9.x>. At
first I thought the issue was that the Jenkins servers had different Docker
networking that didn't support these tests, and I let it be for a bit. Now
we are running Solr-Docker-Nightly-9.4
<https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-9.4>, which
has the same tests included and it passes. So it does seem like the Jenkins
servers allow us to use Docker networking in the ways we want, but for some
reason 9.x and 9.4 (which should be nearly identical) don't behave the
same way. Looking at the error logs, the problem is:

> /opt/solr/docker/scripts/docker-entrypoint.sh: line 48: exec:
> solr-exporter: not found
>
Off the top of my head, I think this might be using the slim Docker image?
Otherwise there's no reason why the Solr exporter wouldn't be
there... (I also have no idea why it wouldn't work the same on the 9.4 build...)
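
One way to test the slim-image guess would be to ask the image itself whether `solr-exporter` is on the PATH. The following is a rough sketch, assuming Docker is available on the machine; the function names are made up for illustration, and the image name is whatever the nightly job actually builds (I'm not guessing the tag here).

```python
import subprocess
from typing import List


def exporter_check_command(image: str) -> List[str]:
    """Build a `docker run` command that looks up solr-exporter on the PATH.

    The entrypoint is overridden with `sh` so that the image's own
    docker-entrypoint.sh (the script that fails in the logs) is bypassed.
    """
    return [
        "docker", "run", "--rm",
        "--entrypoint", "sh",
        image,
        "-c", "command -v solr-exporter",
    ]


def image_has_exporter(image: str) -> bool:
    """Return True if `command -v solr-exporter` succeeds inside the image."""
    result = subprocess.run(
        exporter_check_command(image), capture_output=True, text=True
    )
    return result.returncode == 0
```

Running `image_has_exporter(...)` against the images built by the 9.x and 9.4 jobs should show whether the two builds really differ in what they ship.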

Anyway, this is just a list of what's going on. I'll try to fix the Docker
stuff, but would love help with the other builds!

- Houston
