[jira] [Work logged] (TS-4806) Fix up event processor thread stacks

ASF GitHub Bot (JIRA) Sun, 04 Sep 2016 17:33:33 -0700

     [ 
https://issues.apache.org/jira/browse/TS-4806?focusedWorklogId=28088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-28088
 ]


ASF GitHub Bot logged work on TS-4806:
--------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Sep/16 00:33
            Start Date: 05/Sep/16 00:33
    Worklog Time Spent: 10m 
      Work Description: GitHub user PSUdaemon commented on a diff in the pull 
request:

    https://github.com/apache/trafficserver/pull/956#discussion_r77461624
  
    --- Diff: iocore/eventsystem/UnixEventProcessor.cc ---
    @@ -152,6 +155,59 @@ EventProcessor::start(int n_event_threads, size_t stacksize)
     #else
           Debug("iocore_thread", "EThread: %d %s: %d", i, obj_name, obj->logical_index);
     #endif // HWLOC_API_VERSION
    +    }
    +#endif // TS_USE_HWLOC
    +
    +    snprintf(thr_name, MAX_THREAD_NAME_LENGTH, "[ET_NET %d]", i);
    +#if TS_USE_HWLOC
    +    if (obj_count > 0) {
    +      hwloc_membind_policy_t mem_policy = HWLOC_MEMBIND_DEFAULT;
    +      hwloc_nodeset_t nodeset           = hwloc_bitmap_alloc();
    +      int num_nodes                     = 0;
    +
    +      hwloc_cpuset_to_nodeset(ink_get_topology(), obj->cpuset, nodeset);
    +      num_nodes = hwloc_get_nbobjs_inside_cpuset_by_type(ink_get_topology(), obj->cpuset, HWLOC_OBJ_NODE);
    +
    +      if (num_nodes == 1) {
    +        mem_policy = HWLOC_MEMBIND_BIND;
    +      } else if (num_nodes > 1) {
    +        mem_policy = HWLOC_MEMBIND_INTERLEAVE;
    +      }
    +
    +      if (mem_policy != HWLOC_MEMBIND_DEFAULT) {
    +        hwloc_set_membind_nodeset(ink_get_topology(), nodeset, mem_policy, HWLOC_MEMBIND_THREAD);
    +      }
    --- End diff --
    
    So each thread has a CPU set that we convert into a node set. A node in 
this case is a NUMA memory node. The likely case is that the CPU set maps to 
exactly one node. But if you chose to bind your threads to the `machine` or 
`system` in `records.config`, then the CPU set might cover multiple nodes, in 
which case we want to interleave.
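
As a rough illustration of that decision (a sketch only, not the code in the patch), the policy choice can be isolated in a small function. `MemPolicy` and `choose_mem_policy` are hypothetical names standing in for hwloc's `HWLOC_MEMBIND_*` values so that the snippet compiles without hwloc installed.

```cpp
// Hypothetical stand-ins for HWLOC_MEMBIND_DEFAULT / _BIND / _INTERLEAVE.
enum class MemPolicy { Default, Bind, Interleave };

// Pick a memory binding policy from the number of NUMA nodes that a
// thread's CPU set covers:
//   exactly one node   -> bind to that node
//   more than one node -> interleave across them
//   anything else      -> leave the OS default alone
constexpr MemPolicy choose_mem_policy(int num_nodes)
{
  return num_nodes == 1 ? MemPolicy::Bind
                        : (num_nodes > 1 ? MemPolicy::Interleave : MemPolicy::Default);
}
```

In the diff above, the equivalent of the `Default` result is simply skipping the `hwloc_set_membind_nodeset` call.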
    
    I don't think this depends at all on how `malloc(3)` chooses to allocate. 
It's all about what memory the OS (kernel/libc) returns for a given thread. So 
we tell the OS: please give this thread memory from a specific NUMA node.
    
    Each thread will likely be on a different node, so we need to look at the 
CPU set for each thread. The PUs are also usually interleaved, so PU0 might be 
on socket 0, PU1 on socket 1, and then PU2 back on socket 0. So a reasonable 
CPU set might be PU{0,2,4,6} on a dual-socket quad-core system, and that might 
then reasonably translate to NUMA node 0.
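
To make the interleaved numbering concrete, here is a minimal sketch. It assumes PUs are numbered strictly round-robin across sockets, which is only one possible layout (real topologies should be queried through hwloc). `socket_of_pu` and `pus_on_socket` are hypothetical helpers, not Traffic Server or hwloc functions.

```cpp
#include <cassert>
#include <vector>

// Assumed layout: PU numbers are handed out round-robin across sockets,
// so PU0 -> socket 0, PU1 -> socket 1, PU2 -> socket 0, and so on.
// Precondition: num_sockets > 0.
constexpr int socket_of_pu(int pu, int num_sockets)
{
  return pu % num_sockets;
}

// Collect the PU indexes that land on one socket under that assumption.
std::vector<int> pus_on_socket(int socket, int num_sockets, int num_pus)
{
  assert(num_sockets > 0);
  std::vector<int> pus;
  for (int pu = 0; pu < num_pus; ++pu) {
    if (socket_of_pu(pu, num_sockets) == socket) {
      pus.push_back(pu);
    }
  }
  return pus;
}
```

With 2 sockets and 8 PUs, `pus_on_socket(0, 2, 8)` yields PUs 0, 2, 4 and 6, matching the PU{0,2,4,6} example above.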
    
    `BIND` means use the specified node. So when a thread's CPU set falls 
within a single NUMA node (the likely and preferred case), we tell it to use 
just that one node.
    
    Also, I will be happy to comment this up a bit more.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 28088)
    Time Spent: 2.5h  (was: 2h 20m)

> Fix up event processor thread stacks
> ------------------------------------
>
>                 Key: TS-4806
>                 URL: https://issues.apache.org/jira/browse/TS-4806
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Phil Sorber
>            Assignee: Phil Sorber
>             Fix For: 7.0.0
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Fix event processor to create stacks on the appropriate numa node and with 
> the appropriate page size. Also, stop using the main thread as ET_NET 0 since 
> we can't control any of these aspects of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
