[ 
https://issues.apache.org/jira/browse/TS-4806?focusedWorklogId=28718&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-28718
 ]

ASF GitHub Bot logged work on TS-4806:
--------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Sep/16 12:28
            Start Date: 10/Sep/16 12:28
    Worklog Time Spent: 10m 
      Work Description: Github user gtenev commented on the issue:

    https://github.com/apache/trafficserver/pull/956
  
    @PSUdaemon, the patch looks good, +1 on fixing the sleep(1).
    
    Built and ran it in production with ``exec_thread.affinity: 1`` and it ran fine.
    
    Here are some of the things I checked.
    
    The ``numastat`` output looked pretty much the same as before the patch was applied:
    
    ```
    $ sudo numastat -p $(pgrep -f traffic_server)
    
    Per-node process memory usage (in MBs) for PID 31075 ([TS_MAIN])
                               Node 0          Node 1           Total
                      --------------- --------------- ---------------
    Huge                         0.00            0.00            0.00
    Heap                         0.00            0.00            0.00
    Stack                        1.52            0.94            2.46
    Private                 118751.97       119729.43       238481.40
    ----------------  --------------- --------------- ---------------
    Total                   118753.49       119730.37       238483.86
    ```
    
    "Stop using the main thread as ET_NET 0 ..." (from the Jira)
    A new thread, ``TS_MAIN``, now shows up in the thread list:
    
    ```
    $ ps -e -T -o 'pid,ucmd'|grep $(pgrep -f traffic_server)|cut -d" " -f2|sort |uniq -c
          5 [ACCEPT
         24 [ET_AIO
         24 [ET_NET
          1 [ET_OCSP
          2 [ET_TASK
          1 [LOG_FLUSH]
          1 [LOG_PREPROC
          2 traffic_server
          1 [TS_MAIN]
    ```
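    
    The ``cut -d" " -f2`` in the command above truncates thread names at the first space, which is why most entries appear without a closing bracket. A minimal sketch of a tidier count follows; ``count_thread_types`` is a hypothetical helper name, not part of the patch, and it only assumes one thread name per line on stdin:
    
    ```shell
    # Hypothetical helper: group thread names by type.
    # Reads names such as "[ET_NET 12]" or "[TS_MAIN]" from stdin,
    # strips the brackets and any trailing numeric index, then counts.
    count_thread_types() {
      sed -E 's/^\[//; s/( [0-9]+)?\]$//' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{print $2, $1}'   # print "name count"
    }
    ```
    
    On a Linux host with procps it could be fed with something like ``ps -T -o ucmd= -p "$pid" | count_thread_types``, where ``$pid`` is assumed to hold the ``traffic_server`` process id.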
    
    The ``traffic_server`` ``ET_NET`` threads are distributed evenly across the 2 NUMA nodes on the machine where I tested (12 per node, each running on its nodeset and bound to the corresponding cpuset, as expected):
    
    ```
    $ sudo lstopo --top --no-io -.xml|grep traffic_server|awk '{match($12, /(.*)"/, a); printf("%s %s ", $6, $9); system("ps -e -T -o pid,spid,ucmd|grep " a[1]);}'
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31101 [ET_NET 22]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31099 [ET_NET 20]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31097 [ET_NET 18]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31095 [ET_NET 16]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31093 [ET_NET 14]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31091 [ET_NET 12]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31089 [ET_NET 10]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31087 [ET_NET 8]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31085 [ET_NET 6]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31083 [ET_NET 4]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31081 [ET_NET 2]
    allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31079 [ET_NET 0]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31102 [ET_NET 23]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31100 [ET_NET 21]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31098 [ET_NET 19]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31096 [ET_NET 17]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31094 [ET_NET 15]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31092 [ET_NET 13]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31090 [ET_NET 11]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31088 [ET_NET 9]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31086 [ET_NET 7]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31084 [ET_NET 5]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31082 [ET_NET 3]
    allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31080 [ET_NET 1]
    ```
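    
    To read the ``allowed_cpuset`` masks above as CPU numbers, a small decoder can help. The sketch below assumes the hwloc bitmap string format (comma-separated 32-bit hex words, most significant word first); ``decode_cpuset`` is a hypothetical helper name and is not part of hwloc or the patch:
    
    ```shell
    # Hypothetical helper: list the bit indexes set in an hwloc-style
    # bitmap string such as "0x000000ff,0xc00ffc00".
    # Assumption: words are 32 bits wide, most significant word first.
    decode_cpuset() {
      local IFS=','              # split the argument on commas; also joins output
      local -a words=($1)
      local -a out=()
      local n=${#words[@]} i bit word base
      # Walk from the last (least significant) word to the first.
      for ((i = n - 1; i >= 0; i--)); do
        word=$(( ${words[i]} ))
        base=$(( (n - 1 - i) * 32 ))
        for ((bit = 0; bit < 32; bit++)); do
          if (( (word >> bit) & 1 )); then
            out+=( $((base + bit)) )
          fi
        done
      done
      echo "${out[*]}"           # comma-separated because IFS is ','
    }
    ```
    
    Under that assumption, ``decode_cpuset 0x3ff003ff`` yields CPUs 0-9 and 20-29, and ``decode_cpuset 0x000000ff,0xc00ffc00`` yields CPUs 10-19 and 30-39, so the two cpusets do not overlap.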
    
    
    The NUMA policy is ``preferred=node0`` and ``preferred=node1`` for the 
corresponding stack segments.
    
    ```
    $ for pid in `ps -e -T -o 'spid,ucmd'|grep ET_NET |cut -d" " -f1 `; do sudo grep stack /proc/${pid}/numa_maps; done| awk '{match($3, /.*:(.*)/, a); printf("%s ", $2); system("ps -e -T -o pid,spid,ucmd|grep " a[1])}' |sort -u
    . . .
    prefer:0 31075 31079 [ET_NET 0]
    prefer:0 31075 31080 [ET_NET 1]
    prefer:0 31075 31081 [ET_NET 2]
    prefer:0 31075 31082 [ET_NET 3]
    . . .
    prefer:0 31075 31100 [ET_NET 21]
    prefer:0 31075 31101 [ET_NET 22]
    prefer:0 31075 31102 [ET_NET 23]
    . . .
    prefer:1 31075 31079 [ET_NET 0]
    prefer:1 31075 31080 [ET_NET 1]
    prefer:1 31075 31081 [ET_NET 2]
    prefer:1 31075 31082 [ET_NET 3]
    . . .
    prefer:1 31075 31100 [ET_NET 21]
    prefer:1 31075 31101 [ET_NET 22]
    prefer:1 31075 31102 [ET_NET 23]
    ```
    
    Checked a few ``ET_NET`` stack segment sizes and they look OK 
    (as configured by ``proxy.config.thread.default.stacksize: 1048576``) 
    
    ```
    $ for pid in `ps -e -T -o 'spid,ucmd'|grep ET_NET |cut -d" " -f1 `; do sudo grep stack /proc/${pid}/maps; done| awk '{print $1}' |head
    2aaaaaf66000-2aaaab066000
    2aaaab76b000-2aaaab86b000
    2aaab0c01000-2aaab0e01000
    2aaab0e02000-2aaab0f02000
    2aaab160f000-2aaab170f000
    2aaab4001000-2aaab4101000
    2aaab4102000-2aaab4202000
    2aaab4203000-2aaab4303000
    2aaab4801000-2aaab4901000
    2aaab4902000-2aaab4a02000
    ```
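    
    The segment sizes can also be computed from the address ranges rather than read by eye. A minimal sketch, where ``stack_size`` is a hypothetical helper name that takes one ``start-end`` range in the format of the first column of ``/proc/<pid>/maps``:
    
    ```shell
    # Hypothetical helper: size in bytes of a "start-end" hex address range.
    stack_size() {
      local range=$1
      local start=${range%-*}    # text before the dash
      local end=${range#*-}      # text after the dash
      echo $(( 16#$end - 16#$start ))
    }
    ```
    
    For the ranges listed above this gives 1048576 bytes for every range except the third (``2aaab0c01000-2aaab0e01000``), which spans 2097152 bytes.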


Issue Time Tracking
-------------------

    Worklog Id:     (was: 28718)
    Time Spent: 4h 10m  (was: 4h)

> Fix up event processor thread stacks
> ------------------------------------
>
>                 Key: TS-4806
>                 URL: https://issues.apache.org/jira/browse/TS-4806
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Phil Sorber
>            Assignee: Phil Sorber
>             Fix For: 7.0.0
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Fix event processor to create stacks on the appropriate numa node and with 
> the appropriate page size. Also, stop using the main thread as ET_NET 0 since 
> we can't control any of these aspects of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
