ywkaras commented on a change in pull request #7281:
URL: https://github.com/apache/trafficserver/pull/7281#discussion_r527264937



##########
File path: plugins/s3_auth/s3_auth.cc
##########
@@ -496,40 +598,56 @@ ConfigCache::get(const char *fname)
   auto it = _cache.find(config_fname);
 
   if (it != _cache.end()) {
-    if (tv.tv_sec > (it->second.second + _ttl)) {
-      // Update the cached configuration file.
-      S3Config *s3 = new S3Config(false); // false == this config does not get 
the continuation
-
-      TSDebug(PLUGIN_NAME, "Configuration from %s is stale, reloading", 
config_fname.c_str());
-      it->second.second = tv.tv_sec;
-      if (s3->parse_config(config_fname)) {
-        it->second.first = s3;
+    unsigned update_status = it->second.update_status;
+    if (tv.tv_sec > (it->second.load_time + _ttl)) {
+      if (!(update_status & 1) && 
it->second.update_status.compare_exchange_strong(update_status, update_status + 
1)) {
+        TSDebug(PLUGIN_NAME, "Configuration from %s is stale, reloading", 
config_fname.c_str());
+        s3 = new S3Config(false); // false == this config does not get the 
continuation
+
+        if (s3->parse_config(config_fname)) {
+          s3->set_conf_fname(fname);
+        } else {
+          // Failed the configuration parse... Set the cache response to 
nullptr
+          delete s3;
+          s3 = nullptr;
+          TSAssert(!"Configuration parsing / caching failed");
+        }
+
+        delete it->second.config;
+        it->second.config    = s3;
+        it->second.load_time = tv.tv_sec;
+
+        // Update is complete.
+        ++it->second.update_status;
       } else {
-        // Failed the configuration parse... Set the cache response to nullptr
-        delete s3;
-        it->second.first = nullptr;
+        // This thread lost the race with another thread that is also reloading
+        // the config for this file. Wait for the other thread to finish 
reloading.
+        while (it->second.update_status & 1) {
+          // Hopefully yielding will sleep the thread at least until the next
+          // scheduler interrupt, preventing a busy wait.
+          std::this_thread::yield();
+        }
+        s3 = it->second.config;

Review comment:
       Unfortunately, as much as I have already tortured poor Fei about this 
PR, I think there is still a remaining hypothetical race condition.  Suppose a 
thread executes through line 630, and is preempted at some point while the 
value in s3 is still in use.  While it is not running, suppose a second thread 
executes line 602, with a tv.tv_sec value such that the condition for the if 
statement is true.  The second thread then continues on and executes line 616, 
before the first thread runs again.  When the first thread is run again, it 
will do a use-after-free.
   
   Since this seems extremely unlikely, I think we should go ahead and merge 
this PR, and then see how/if to fix this concern.
   
   If we were using C++20, we could simply make ConfigCache::get() return a 
shared_ptr, and make the config field in _Data an atomic shared_ptr.  How soon 
are we going to 20?
   
   Barring that, we can make the shared_ptr approach work by replacing 
update_status with a mutex.  This would have the added benefit of getting rid 
of a lot tricky logic I asked Fei to put in this function.  (Note that this 
function never creates new map entries when the map is being accessed by 
multiple threads, so we only have to lock the mutex when accessing an existing 
map entry.). Since that could potential be a fair number of mutexes, but I 
think they would be low-contention, I suggest using a "slimmed-down" 
alternative mutex:  https://github.com/apache/trafficserver/pull/7330




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to