Copilot commented on code in PR #12996:
URL: https://github.com/apache/trafficserver/pull/12996#discussion_r2956137842
##########
src/iocore/cache/CacheDir.cc:
##########
@@ -958,20 +959,42 @@ Directory::entries_used()
}
/*
- * this function flushes the cache meta data to disk when
+ * This function flushes the cache meta data to disk when
* the cache is shutdown. Must *NOT* be used during regular
- * operation.
+ * operation. Stripes are synced in parallel, one thread per
+ * physical disk.
*/
void
sync_cache_dir_on_shutdown()
{
- Dbg(dbg_ctl_cache_dir_sync, "sync started");
- EThread *t = reinterpret_cast<EThread *>(0xdeadbeef);
+ Dbg(dbg_ctl_cache_dir_sync, "shutdown sync started");
+
+ std::unordered_map<CacheDisk *, std::vector<int>> drive_stripe_map;
+
for (int i = 0; i < gnstripes; i++) {
- gstripes[i]->shutdown(t);
+ drive_stripe_map[gstripes[i]->disk].push_back(i);
}
- Dbg(dbg_ctl_cache_dir_sync, "sync done");
+
+ std::vector<std::thread> threads;
+
+ threads.reserve(drive_stripe_map.size());
+ for (auto &[disk, indices] : drive_stripe_map) {
+ Dbg(dbg_ctl_cache_dir_sync, "Disk %s: syncing %zu stripe(s)", disk->path,
indices.size());
+ threads.emplace_back([&indices]() {
+ EThread *t = reinterpret_cast<EThread *>(0xdeadbeef);
+
+ for (int idx : indices) {
Review Comment:
The worker lambda captures the range-for binding `indices` by reference. In
a range-based for loop this means every thread can end up referencing the same
`indices` variable as it gets reassigned each iteration, causing threads to
sync the wrong stripes (often the last disk’s stripes) and potentially run
shutdown on the same stripe concurrently.
Capture the per-iteration value safely (e.g., copy `indices` into the
lambda, or capture a `std::reference_wrapper`/pointer computed per iteration)
so each thread operates on the intended stripe list.
##########
src/iocore/cache/CacheDir.cc:
##########
@@ -958,20 +959,42 @@ Directory::entries_used()
}
/*
- * this function flushes the cache meta data to disk when
+ * This function flushes the cache meta data to disk when
* the cache is shutdown. Must *NOT* be used during regular
- * operation.
+ * operation. Stripes are synced in parallel, one thread per
+ * physical disk.
*/
void
sync_cache_dir_on_shutdown()
{
- Dbg(dbg_ctl_cache_dir_sync, "sync started");
- EThread *t = reinterpret_cast<EThread *>(0xdeadbeef);
+ Dbg(dbg_ctl_cache_dir_sync, "shutdown sync started");
+
+ std::unordered_map<CacheDisk *, std::vector<int>> drive_stripe_map;
+
for (int i = 0; i < gnstripes; i++) {
- gstripes[i]->shutdown(t);
+ drive_stripe_map[gstripes[i]->disk].push_back(i);
}
- Dbg(dbg_ctl_cache_dir_sync, "sync done");
+
+ std::vector<std::thread> threads;
+
+ threads.reserve(drive_stripe_map.size());
+ for (auto &[disk, indices] : drive_stripe_map) {
+ Dbg(dbg_ctl_cache_dir_sync, "Disk %s: syncing %zu stripe(s)", disk->path,
indices.size());
+ threads.emplace_back([&indices]() {
+ EThread *t = reinterpret_cast<EThread *>(0xdeadbeef);
+
+ for (int idx : indices) {
+ gstripes[idx]->shutdown(t);
+ }
Review Comment:
All shutdown threads pass the same dummy `EThread*` value (0xdeadbeef) into
`StripeSM::shutdown()`. `MUTEX_TAKE_LOCK()` uses that pointer to decide whether
to actually acquire the mutex; if two OS threads ever hit the same `ProxyMutex`
with the same `EThread*`, the second thread will skip locking entirely because
`thread_holding == t`.
To keep locking correct, use a distinct thread identity per worker (or avoid
needing a fake `EThread*` by running the shutdown sync work on ATS `ET_TASK`
threads and passing `this_ethread()`).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]