leonfin commented on PR #7930:
URL: https://github.com/apache/geode/pull/7930#issuecomment-3316539248
Hi @JinwooHwang, I asked an AI about the
SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest failure.
Here is its analysis:
"Why the error shows up after the Java 17 upgrade
Based on my investigation, the "could not get remote locator information
for remote site" error appears after the Java 17 upgrade due to several factors:
1. Container Platform Detection Changes _(<--- this is not the case; from
what I see, we already have the add-opens)_
Java 17's module system is stricter about accessing internal JDK classes.
The test uses Docker containers, and Java 17 requires explicit module opens to
detect container environments properly:
--add-opens=java.base/jdk.internal.platform=ALL-UNNAMED
Without proper container detection, network interface discovery and
routing can behave differently.
2. Network Stack Timing Changes
Java 17 has subtle changes in its networking implementation that affect
connection establishment timing. The test involves:
- HAProxy load balancer in Docker
- Multiple gateway receivers on the same port
- 50-second connection timeouts
These timing differences can cause the connection attempts to fail at
different points, making previously silent failures now visible in logs.
3. Docker Networking + Java 17
The test setup uses Docker containers that are "not route-able from the
host" and rely on HAProxy for load balancing. Java 17's improved container
awareness paradoxically makes it more sensitive to these complex
networking setups, potentially trying different connection strategies that
fail before falling back to working ones.
4. Error Logging Frequency
The error was likely always happening but:
- Java 17's networking might retry connections differently
- The error might occur at different phases of test execution
- The test framework's suspect string detection might have become more
sensitive
5. Not a Real Problem
The important point is: this isn't actually a failure - it's an expected
transient condition. The gateway sender successfully connects after retries.
The error message is just noise from the retry mechanism, which is why adding
IgnoredException is the correct fix."
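On point 1, one way to confirm whether the add-opens is actually in effect is
to ask the JVM directly. The following is only a minimal sketch: the class and
helper names are hypothetical, it uses the standard java.lang.Module API, and
it reports on the JVM it runs in, so it would have to run inside the test VMs
with the same flags to say anything about this test. It only shows whether the
package is open, not whether container detection itself works.

```java
public class AddOpensCheck {
  // Hypothetical helper: reports whether java.base opens the given package
  // to the module this class was loaded into (the unnamed module when the
  // class is loaded from the classpath).
  static boolean isOpenToThisCode(String packageName) {
    Module javaBase = Object.class.getModule();
    Module caller = AddOpensCheck.class.getModule();
    return javaBase.isOpen(packageName, caller);
  }

  public static void main(String[] args) {
    // Expected to be true only when the JVM was started with
    // --add-opens=java.base/jdk.internal.platform=ALL-UNNAMED
    System.out.println("jdk.internal.platform open: "
        + isOpenToThisCode("jdk.internal.platform"));
  }
}
```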
The AI then adds the fix at the beginning of these test methods:
testPingsToReceiversWithSamePortAndHostnameForSendersReachTheRightReceivers
testSerialGatewaySenderThreadsConnectToSameReceiver
testTwoSendersWithSameIdShouldUseSameValueForEnforceThreadsConnectToSameServer
testPingsToReceiversWithSamePortAndHostnameForSendersUseOnlyOneMoreConnection
testPingsToReceiversWithSamePortAndHostnameForSendersReachTheRightReceiver
as follows:
// Ignore the expected error message about not getting remote locator information
IgnoredException.addIgnoredException("could not get remote locator information for remote site.*");
If we check other tests like WANRolling*, they also have:
IgnoredException ie =
    IgnoredException.addIgnoredException("could not get remote locator information");
Or SerialWANPropagationDUnitTest, which adds this in its post setup:
IgnoredException.addIgnoredException("could not get remote locator information");
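To compare the two pattern strings above, here is a rough sketch of how they
behave as regular expressions. It assumes the ignore strings are applied as
"find anywhere in the log line" regexes, which I have not verified against the
DUnit suspect-string code. The class, the helper and the sample log line are
made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

public class IgnoredPatternSketch {
  // Hypothetical helper: true if any ignore pattern occurs somewhere in the line.
  static boolean isIgnored(String logLine, List<Pattern> ignored) {
    for (Pattern p : ignored) {
      if (p.matcher(logLine).find()) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Made-up log line; only the quoted message text comes from the thread.
    String line = "[error] sender: could not get remote locator information for remote site 2.";

    Pattern longForm =
        Pattern.compile("could not get remote locator information for remote site.*");
    Pattern shortForm = Pattern.compile("could not get remote locator information");

    System.out.println(isIgnored(line, List.of(longForm)));
    System.out.println(isIgnored(line, List.of(shortForm)));
  }
}
```

Under that assumption both forms match the same line, and the trailing ".*"
adds nothing, so the shorter string already used by the other WAN tests
would be enough.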
So I'm not sure this is something to worry about. Maybe we just need to add
the ignored exception and that's it?