Copilot commented on code in PR #12524: URL: https://github.com/apache/trafficserver/pull/12524#discussion_r2501737715
########## src/iocore/eventsystem/Watchdog.cc: ########## @@ -0,0 +1,100 @@ +/** @file + + A watchdog for event loops + + @section license License + + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + */ + +#include "iocore/eventsystem/Watchdog.h" +#include "iocore/eventsystem/EThread.h" +#include "tscore/Diags.h" +#include "tscore/ink_assert.h" +#include "tscore/ink_thread.h" +#include "tsutil/DbgCtl.h" + +#include <atomic> +#include <chrono> +#include <thread> +#include <functional> + +namespace Watchdog +{ + +DbgCtl dbg_ctl_watchdog("watchdog"); + +Monitor::Monitor(EThread *threads[], size_t n_threads, std::chrono::milliseconds timeout_ms) + : _threads(threads, threads + n_threads), _watchdog_thread{std::bind_front(&Monitor::monitor_loop, this)}, _timeout{timeout_ms} +{ + // Precondition: timeout_ms must be > 0. A timeout of 0 indicates the watchdog is disabled + // and the caller should not instantiate the Monitor (see traffic_server.cc). + ink_assert(timeout_ms.count() > 0); Review Comment: The watchdog thread is started in the constructor's member initializer list before the constructor body executes. This creates a race condition where `monitor_loop()` could access `_threads`, `_timeout`, or `_shutdown` before they are fully initialized. The thread should be started after all member variables are initialized, either at the end of the constructor body or by using a separate `start()` method. ```suggestion : _threads(threads, threads + n_threads), _timeout{timeout_ms} { // Precondition: timeout_ms must be > 0. A timeout of 0 indicates the watchdog is disabled // and the caller should not instantiate the Monitor (see traffic_server.cc). ink_assert(timeout_ms.count() > 0); _watchdog_thread = std::thread(std::bind_front(&Monitor::monitor_loop, this)); ``` ########## src/traffic_server/traffic_server.cc: ########## @@ -2131,6 +2137,14 @@ main(int /* argc ATS_UNUSED */, const char **argv) RecRegisterConfigUpdateCb("proxy.config.dump_mem_info_frequency", init_memory_tracker, nullptr); init_memory_tracker(nullptr, RECD_NULL, RecData(), nullptr); + // Start the watchdog + int watchdog_timeout_ms = RecGetRecordInt("proxy.config.exec_thread.watchdog.timeout_ms").value_or(1000); + if (watchdog_timeout_ms > 0) { + watchdog = std::make_unique<Watchdog::Monitor>(eventProcessor.thread_group[ET_NET]._thread, + static_cast<size_t>(eventProcessor.thread_group[ET_NET]._count), + std::chrono::milliseconds{watchdog_timeout_ms}); Review Comment: The watchdog is created immediately after `eventProcessor.start()` is called (line 2132), but the ET_NET threads may not have fully initialized their heartbeat state yet. Since `heartbeat_state` members are initialized to sentinel values (`time_point::min()` and `seq{0}`), the watchdog should either wait for threads to be ready or the initialization order should be documented to prevent potential timing issues during startup. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
