leonfin commented on PR #7930:
URL: https://github.com/apache/geode/pull/7930#issuecomment-3316539248

   Hi @JinwooHwang, for 
SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest I asked an AI 
to analyze the failure. Here is its analysis:
   
   " Why the error shows up after Java 17 upgrade
   
     Based on my investigation, the "could not get remote locator information 
for remote site" error appears after the Java 17 upgrade due to several factors:
   
     1. Container Platform Detection Changes _(<--- this is not the case; from 
what I can see, we already have the add-opens)_
   
     Java 17's module system is stricter about accessing internal JDK classes. 
The test uses Docker containers, and Java 17 requires explicit module opens to 
detect container environments properly:
     --add-opens=java.base/jdk.internal.platform=ALL-UNNAMED
     Without proper container detection, network interface discovery and 
routing can behave differently.
   
     2. Network Stack Timing Changes
   
     Java 17 has subtle changes in its networking implementation that affect 
connection establishment timing. The test involves:
     - HAProxy load balancer in Docker
     - Multiple gateway receivers on the same port
     - 50-second connection timeouts
   
     These timing differences can cause the connection attempts to fail at 
different points, making previously silent failures now visible in logs.
   
     3. Docker Networking + Java 17
   
     The test setup uses Docker containers that are "not route-able from the 
host" and rely on HAProxy for load balancing. Java 17's improved container 
awareness paradoxically makes it more sensitive to these complex
     networking setups, potentially trying different connection strategies that 
fail before falling back to working ones.
   
     4. Error Logging Frequency
   
     The error was likely always happening but:
     - Java 17's networking might retry connections differently
     - The error might occur at different phases of test execution
     - The test framework's suspect string detection might have become more 
sensitive
   
     5. Not a Real Problem
   
     The important point is: this isn't actually a failure - it's an expected 
transient condition. The gateway sender successfully connects after retries. 
The error message is just noise from the retry mechanism, which is why adding 
IgnoredException is the correct fix."
   
   It then adds the fix at the beginning of these test methods:
   testPingsToReceiversWithSamePortAndHostnameForSendersReachTheRightReceivers
   testSerialGatewaySenderThreadsConnectToSameReceiver
   testTwoSendersWithSameIdShouldUseSameValueForEnforceThreadsConnectToSameServer
   testPingsToReceiversWithSamePortAndHostnameForSendersUseOnlyOneMoreConnection
   testPingsToReceiversWithSamePortAndHostnameForSendersReachTheRightReceiver
   as:
   // Ignore the expected error message about not getting remote locator information
   IgnoredException.addIgnoredException("could not get remote locator information for remote site.*");
   
   Other tests, such as WANRolling*, already have:
   IgnoredException ie =
           IgnoredException.addIgnoredException("could not get remote locator information");
   
   Or SerialWANPropagationDUnitTest, which adds this in its post-setup:
   IgnoredException.addIgnoredException("could not get remote locator information");
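
The two pattern variants quoted above differ slightly, so here is a minimal, self-contained sketch comparing them. It does not use Geode at all. The class name, the `isIgnored` helper, and the sample log lines are hypothetical, and it assumes a line counts as ignored when the pattern matches anywhere in it, which may not be exactly how the dunit suspect-string check works.

```java
import java.util.List;
import java.util.regex.Pattern;

public class SuspectStringSketch {

  // Pattern proposed for this test (with the trailing ".*").
  static final String PR_PATTERN =
      "could not get remote locator information for remote site.*";

  // Shorter pattern used by the other tests mentioned above.
  static final String SHORT_PATTERN = "could not get remote locator information";

  // Assumption: a log line is ignored when the pattern matches anywhere in it.
  static boolean isIgnored(String pattern, String logLine) {
    return Pattern.compile(pattern).matcher(logLine).find();
  }

  public static void main(String[] args) {
    // Hypothetical log lines, written only for this illustration.
    List<String> lines = List.of(
        "[error] GatewaySender ln could not get remote locator information for remote site 2.",
        "[error] GatewaySender ln could not get remote locator information",
        "[error] some unrelated failure");

    // Print, for each line, whether each pattern would ignore it.
    for (String line : lines) {
      System.out.println(
          isIgnored(PR_PATTERN, line) + " " + isIgnored(SHORT_PATTERN, line) + " " + line);
    }
  }
}
```

Under that matching assumption, the trailing `.*` adds nothing, and the shorter pattern is simply broader: it also covers messages that do not mention the remote site. Either variant would cover the message reported for this test.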
   
   So I'm not sure this is something we need to worry about. Maybe we just 
need to add the ignored exception and that's it?
       
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

