Copilot commented on code in PR #13086:
URL: https://github.com/apache/trafficserver/pull/13086#discussion_r3075545245


##########
src/traffic_server/traffic_server.cc:
##########
@@ -898,6 +898,9 @@ CB_After_Cache_Init()
     // call accept on the ports now that the cache is initialized.
     Note("Enabling listen, cache initialization finished");
     start_HttpProxyServer();
+#if defined(__linux__)
+    unshare_et_net_fd_tables();

Review Comment:
   Calling unshare_et_net_fd_tables() unconditionally on Linux will break the 
default accept thread mode (proxy.config.accept_threads defaults to 1). 
Dedicated accept threads (spawn_thread / DEDICATED) accept() sockets and then 
schedule the resulting VC onto an ET_NET thread; after ET_NET unshares its 
files_struct, those newly accepted FDs won’t exist in the ET_NET thread’s FD 
table, leading to EBADF / intermittent connection failures. Gate this call so 
it only runs when accept happens on ET_NET threads (e.g., accept_threads==0 
and/or exec_thread.listen==1), or change the accept-thread handoff to pass FDs 
(e.g., SCM_RIGHTS) into the destination thread’s FD table.
   ```suggestion
       int accept_threads = 0;
   
       REC_ReadConfigInteger(accept_threads, "proxy.config.accept_threads");
       if (0 == accept_threads) {
         unshare_et_net_fd_tables();
       }
   ```



##########
src/iocore/net/UnixNet.cc:
##########
@@ -200,3 +207,44 @@ initialize_thread_for_net(EThread *thread)
 #endif
   thread->ep = ep;
 }
+
+#if defined(__linux__)
+// Give each ET_NET thread its own kernel FD table (files_struct) so that
+// accept4/close no longer contend on a single shared spinlock.  Every FD
+// that was open at the time of the call is copied into the new private
+// table — the underlying kernel objects (struct file) are shared, so
+// cross-thread eventfd signalling and shared cache-disk FDs keep working.
+//
+// This must be scheduled AFTER start_HttpProxyServer() so that all
+// initialization FDs (eventfds, cache disks, DNS sockets, log files,
+// listen sockets, plugin FDs) are already in place.

Review Comment:
   The comment/doc for this feature is missing an important correctness 
constraint: unsharing the ET_NET files_struct is only safe if accepted sockets 
are created within the ET_NET thread that will use them. With dedicated accept 
threads (accept_threads > 0), sockets are accepted on a different (DEDICATED) 
thread and then scheduled onto ET_NET, which relies on a shared FD table and 
will fail once ET_NET unshares.



##########
src/iocore/net/UnixNet.cc:
##########
@@ -200,3 +207,44 @@ initialize_thread_for_net(EThread *thread)
 #endif
   thread->ep = ep;
 }
+
+#if defined(__linux__)
+// Give each ET_NET thread its own kernel FD table (files_struct) so that
+// accept4/close no longer contend on a single shared spinlock.  Every FD
+// that was open at the time of the call is copied into the new private
+// table — the underlying kernel objects (struct file) are shared, so
+// cross-thread eventfd signalling and shared cache-disk FDs keep working.
+//
+// This must be scheduled AFTER start_HttpProxyServer() so that all
+// initialization FDs (eventfds, cache disks, DNS sockets, log files,
+// listen sockets, plugin FDs) are already in place.
+
+namespace
+{
+struct UnshareFilesCont : public Continuation {
+  int
+  mainEvent(int /* event ATS_UNUSED */, Event * /* e ATS_UNUSED */)
+  {
+    if (unshare(CLONE_FILES) < 0) {
+      Warning("ET_NET thread %d: unshare(CLONE_FILES) failed: %s", 
this_ethread()->id, strerror(errno));
+    } else {
+      Dbg(dbg_ctl_iocore_net, "ET_NET thread %d: FD table unshared", 
this_ethread()->id);
+    }
+    delete this;
+    return EVENT_DONE;
+  }
+
+  UnshareFilesCont() : Continuation(nullptr) { 
SET_HANDLER(&UnshareFilesCont::mainEvent); }
+};
+} // end anonymous namespace
+
+void
+unshare_et_net_fd_tables()
+{
+  int n = eventProcessor.thread_group[ET_NET]._count;
+  for (int i = 0; i < n; i++) {
+    eventProcessor.thread_group[ET_NET]._thread[i]->schedule_imm(new 
UnshareFilesCont());
+  }
+  Note("Scheduled unshare(CLONE_FILES) on %d ET_NET threads", n);
+}

Review Comment:
   unshare_et_net_fd_tables() schedules unshare(CLONE_FILES) on every ET_NET 
thread regardless of whether accept threads are enabled. In the common/default 
configuration (proxy.config.accept_threads > 0), newly accepted sockets are 
created on DEDICATED accept threads and handed off to ET_NET by FD number; 
after unshare, those FDs will not exist in the ET_NET thread’s private table. 
Add a runtime guard (or make this opt-in via a config record) so this only runs 
when accept is performed on ET_NET threads, or rework the accept-thread handoff 
to transfer FDs into the target thread’s table.



##########
src/traffic_server/traffic_server.cc:
##########
@@ -2456,6 +2459,9 @@ main(int /* argc ATS_UNUSED */, const char **argv)
         // In either case we should not delay to accept the ports.
         Dbg(dbg_ctl_http_listen, "Not delaying listen");
         start_HttpProxyServer(); // PORTS_READY_HOOK called from in here
+#if defined(__linux__)
+        unshare_et_net_fd_tables();
+#endif

Review Comment:
   Same issue as above: unsharing ET_NET FD tables here will break 
configurations using dedicated accept threads (proxy.config.accept_threads > 0, 
which is the default on Linux). Those accept threads accept() sockets and hand 
them off to ET_NET via scheduling; with per-thread FD tables the handed-off FD 
number is not valid in the ET_NET thread.
   ```suggestion
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to