zrhoffman opened a new issue #5069: URL: https://github.com/apache/trafficcontrol/issues/5069
## I'm submitting a ...
- bug report

## Traffic Control components affected ...
- Traffic Router

## Current behavior:
When running the Traffic Router tests in `ExternalTestSuite`, `SteeringTest` fails with
```
Expected: a value greater than <986>
     but: <986> was equal to <986>
	at com.comcast.cdn.traffic_control.traffic_router.core.external.SteeringTest.z_itemsMigrateFromSmallerToLargerBucket(SteeringTest.java:356)
```

## Expected behavior:
The tests in `SteeringTest` should pass

## Minimal reproduction of the problem with instructions:
Run the external tests. For convenience, I have made a branch that runs `SteeringTest` only in CDN-in-a-Box

https://github.com/zrhoffman/trafficcontrol/commits/steering-test

You can use it to run the Traffic Router tests, including external tests, as follows:
```shell
docker-compose up -d
docker-compose -f docker-compose.readiness.yml up readiness # Waits for Traffic Monitor to be usable by Traffic Routers
docker-compose -f docker-compose.traffic-router-test.yml up tr-integration-test
```

## Anything else:
`SteeringTest` has failed since 18fe13ac63/#3534 (and snuck into 4.1.0). The only 2 assertions that fail are [these](https://github.com/apache/trafficcontrol/blob/386cf14296/traffic_router/core/src/test/java/com/comcast/cdn/traffic_control/traffic_router/core/external/SteeringTest.java#L356-L357):
```java
assertThat(rehashedPaths.get(smallerTarget).size(), greaterThan(hashedPaths.get(smallerTarget).size()));
assertThat(rehashedPaths.get(largerTarget).size(), lessThan(hashedPaths.get(largerTarget).size()));
```

I think this is because `SteeringWatcher` doesn't get to update [after the POST request to the mock TO server to change the steering attributes](https://github.com/apache/trafficcontrol/blob/386cf14296/traffic_router/core/src/test/java/com/comcast/cdn/traffic_control/traffic_router/core/external/SteeringTest.java#L325-L327).

The reason I don't think `SteeringWatcher` updates after that point is that `LetsEncryptDnsChallengeWatcher`'s `ScheduledExecutorService` hangs when trying to update `LetsEncryptDnsChallengeWatcher`'s database [from `https://${toHostname}/api/2.0/letsencrypt/dnsrecords/`](https://github.com/apache/trafficcontrol/blob/386cf14296/traffic_router/core/src/main/java/com/comcast/cdn/traffic_control/traffic_router/core/ds/LetsEncryptDnsChallengeWatcher.java#L38).

For whatever reason, `Fetcher`'s request for `LetsEncryptDnsChallengeWatcher` never makes it to `HttpDataServer` (the mock TO server) even though the host and port seem to line up. But, if you make `Fetcher` return for `LetsEncryptDnsChallengeWatcher` requests right before [it sends the request](https://github.com/apache/trafficcontrol/blob/386cf14296/traffic_router/core/src/main/java/com/comcast/cdn/traffic_control/traffic_router/core/util/Fetcher.java#L135):
```java
if (url.contains("letsencrypt")) {
    return http;
}
```
then `SteeringTest` succeeds.
If you don't, the request doesn't seem to ever complete.

--

Other notes:
* Changing `DEFAULT_LE_DNS_CHALLENGE_URL` to `"https://${toHostname}/api/2.0/letsencrypt/dnsrecords"` so it doesn't end in a slash, then [adding a mock response for it](https://github.com/apache/trafficcontrol/tree/386cf14296/traffic_router/core/src/test/resources/api/2.0) does not keep `LetsEncryptDnsChallengeWatcher`'s `Fetcher` request from hanging.
* One `AbstractResourceWatcher` hanging should not make another `AbstractResourceWatcher` hang. If Traffic Router does this outside of tests, it's a serious bug that we should fix.
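
To illustrate how one hanging watcher could stall another, here is a minimal sketch. It assumes, and nothing in this issue confirms it, that the watchers' periodic updates end up on one shared `ScheduledExecutorService` with a single worker thread. The class `WatcherStarvationDemo` and its method `secondTaskRuns` are hypothetical names made up for this example; it does not use any Traffic Router code. The first task stands in for a fetch that never returns, and the second for another watcher's update.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WatcherStarvationDemo {
    /**
     * Schedules a task that blocks indefinitely (a stand-in for a hanging
     * fetch), then a second periodic task on the same scheduler.
     * Returns true if the second task got to run within 500 ms.
     */
    public static boolean secondTaskRuns(int poolSize) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(poolSize);
        CountDownLatch hangStarted = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch secondRan = new CountDownLatch(1);
        try {
            // Hypothetical "hanging watcher": occupies a worker thread until released.
            scheduler.scheduleWithFixedDelay(() -> {
                hangStarted.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, 0, 50, TimeUnit.MILLISECONDS);

            // Make sure the hanging task is already running before adding the second one.
            hangStarted.await();

            // Hypothetical "second watcher": only needs to run once to prove progress.
            scheduler.scheduleWithFixedDelay(secondRan::countDown, 0, 50, TimeUnit.MILLISECONDS);

            return secondRan.await(500, TimeUnit.MILLISECONDS);
        } finally {
            // Unblock the hanging task and stop the scheduler so the demo exits.
            release.countDown();
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool size 1: second watcher ran = " + secondTaskRuns(1));
        System.out.println("pool size 2: second watcher ran = " + secondTaskRuns(2));
    }
}
```

With a pool size of 1 the second task stays queued behind the blocked one, while with a pool size of 2 it runs. If the watchers do share an executor, giving each its own scheduler or putting connect and read timeouts on the request would be two possible ways to keep one hang from spreading.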
