OK, I think I fixed the Docker tests. The other issues all still apply, though.
- Houston

On Thu, Oct 26, 2023 at 12:16 PM Houston Putman <hous...@apache.org> wrote:

> The Jenkins builds aren't in a great state right now.
>
> Currently the Solr-Check-main
> <https://ci-builds.apache.org/job/Solr/job/Solr-Check-main> build is
> failing consistently because of random Solr processes being found on the
> box (when the integration tests expect nothing else to be running). Now
> that we have port randomization for the integration tests, it's a very
> good sign that the found Solr processes all use port 8983, meaning that
> we aren't leaking Solrs in the integration tests.
>
> Because of this, the culprit seems to be that the smoke tests (which
> still start a Solr on port 8983) are leaking processes, and looking at
> the logs, that seems to be the case (Solr-Smoketest-9.4
> <https://ci-builds.apache.org/job/Solr/job/Solr-Smoketest-9.4>,
> Solr-Smoketest-9.x
> <https://ci-builds.apache.org/job/Solr/job/Solr-Smoketest-9.x>). So
> fixing the smoke tests leaking Solr processes will in turn fix both the
> smoke test builds and the main check.
>
> As for the Solr-Check-9.x
> <https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x> build, it is
> running on Crave, so it doesn't have the same issue with leaked Solr
> processes. However, on Crave there seems to be an issue with the mTLS
> tests. (Solr-Check-main also has this issue, but strangely only on the
> lucene-solr-1 machine, not lucene-solr-2.) We need to investigate why the
> TLS tests pass locally for everyone (and on one of the two Jenkins
> boxes), but not on Crave.
>
> Lastly, the Docker tests are broken in a very strange way. A while ago, I
> added tests to make sure that the Prometheus exporter can communicate
> correctly in Docker. This test seems to fail on both
> Solr-Docker-Nightly-main
> <https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-main> and
> Solr-Docker-Nightly-9.x
> <https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-9.x>. At
> first I thought the issue was that the Jenkins servers had different
> Docker networking that didn't support these tests, and I let it be for a
> bit. Now we are running Solr-Docker-Nightly-9.4
> <https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-9.4>,
> which has the same tests included, and it passes. So it does seem like
> the Jenkins servers allow us to use Docker networking in the ways we
> want, but for some reason 9.x and 9.4 (which should be nearly identical)
> don't behave the same way. Looking at the error logs, the problem is:
>
>> /opt/solr/docker/scripts/docker-entrypoint.sh: line 48: exec: solr-exporter: not found
>
> Off the top of my head, I think this might be using the slim Docker
> image? Because otherwise there's no reason why the Solr exporter wouldn't
> be there... (Also no idea why it wouldn't work the same on the 9.4
> build...)
>
> Anyway, this is just a list of what's going on. I'll try to fix the
> Docker stuff, but would love help with the other builds!
>
> - Houston