zrhoffman opened a new issue #5182: URL: https://github.com/apache/trafficcontrol/issues/5182
## I'm submitting a ...
- bug report

## Traffic Control components affected ...
- CDN in a Box
- Traffic Vault
- CI tests

## Current behavior:
In CDN in a Box, Traffic Vault sometimes crashes when starting up with this message:
```erlang
[error] <0.325.0> CRASH REPORT Process <0.325.0> with 0 neighbours crashed with reason: no match of right hand value {error,closed} in mochiweb_http:loop/2 line 56
```
See *Anything else* for context in the logs.

This affects our CI because the CDN in a Box Readiness check will sometimes fail as a result: it times out after 12 minutes of attempts to contact a Delivery Service.

## Expected behavior:
Traffic Vault should start successfully. The Traffic Vault image should include a health check that attempts to query the `ssl` bucket and fails if it cannot. If we do this, we can
* Make Traffic Vault exit when the health check fails
* Make Traffic Vault restart when it exits

which should fix this issue with an additional cost of 1-2 minutes of startup time when the issue occurs.

## Minimal reproduction of the problem with instructions:
1. Start CDN-in-a-Box and watch for that error in the Traffic Vault logs
2. If Traffic Vault gets all the way to
   ```erlang
   <0.2189.0>@yz_fuse:create:73 Creating fuse for search index sslkeys
   ```
   but the error is not there, `docker-compose restart trafficvault` and try again. Note that the `sslkeys` index being created does not mean the crash did not occur, so check the logs for the error itself.

This error does not reproduce reliably, so don't feel like you need to reproduce it to fix this issue.

## Anything else:
The log is in a gist (https://gist.github.com/zrhoffman/58850e41d124b0a49acd3203699661d9) because it's 951 lines
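
The health check proposed under *Expected behavior* could look roughly like the sketch below. It is only an illustration, not what the Traffic Vault image does: the function names, `DEFAULT_URL`, the port, and the choice of Riak's bucket-properties endpoint for `ssl` are all assumptions, and a real Traffic Vault container would likely also need HTTPS and credentials. The idea is simply to return a non-zero exit code whenever the `ssl` bucket cannot be queried, so that the container runtime can mark the service unhealthy.

```python
import http.client
import sys
import urllib.request

# Assumed endpoint: Riak's HTTP "get bucket properties" call for the `ssl`
# bucket. Host, port, and scheme are placeholders, not taken from the issue.
DEFAULT_URL = "http://localhost:8098/buckets/ssl/props"


def bucket_is_reachable(url, opener=urllib.request.urlopen, timeout=5.0):
    """Return True only if a GET on `url` answers with HTTP 200.

    `opener` is injectable so the check can be exercised without a server.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused or closed connections, timeouts, and HTTP error
        # statuses (urllib raises HTTPError, an OSError subclass, for those).
        return False


def main(argv, check=bucket_is_reachable):
    """Map the check result to a process exit code: 0 healthy, 1 unhealthy."""
    url = argv[1] if len(argv) > 1 else DEFAULT_URL
    return 0 if check(url) else 1


# As a script entry point, one would call: sys.exit(main(sys.argv))
```

A script like this could be wired into the image's health check, with the exit and restart behaviour from the two bullets above left to the container runtime.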
