[
https://issues.apache.org/jira/browse/TS-4806?focusedWorklogId=28718&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-28718
]
ASF GitHub Bot logged work on TS-4806:
--------------------------------------
Author: ASF GitHub Bot
Created on: 10/Sep/16 12:28
Start Date: 10/Sep/16 12:28
Worklog Time Spent: 10m
Work Description: Github user gtenev commented on the issue:
https://github.com/apache/trafficserver/pull/956
@PSUdaemon, the patch looks good, +1 on fixing the sleep(1).
Built it and ran it in production with ``exec_thread.affinity: 1`` and it ran
fine.
Here are some of the things I checked.
``numastat`` output looked pretty much the same as before the patch was
applied:
```
$ sudo numastat -p $(pgrep -f traffic_server)
Per-node process memory usage (in MBs) for PID 31075 ([TS_MAIN])
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                         0.00            0.00            0.00
Heap                         0.00            0.00            0.00
Stack                        1.52            0.94            2.46
Private                 118751.97       119729.43       238481.40
----------------  --------------- --------------- ---------------
Total                   118753.49       119730.37       238483.86
```
"Stop using the main thread as ET_NET 0 ..." (from the Jira)
A new thread now shows up in the thread list: ``TS_MAIN``
```
$ ps -e -T -o 'pid,ucmd'|grep $(pgrep -f traffic_server)|cut -d" " -f2|sort|uniq -c
5 [ACCEPT
24 [ET_AIO
24 [ET_NET
1 [ET_OCSP
2 [ET_TASK
1 [LOG_FLUSH]
1 [LOG_PREPROC
2 traffic_server
1 [TS_MAIN]
```
``traffic_server`` ``ET_NET`` threads are distributed evenly over the 2
NUMA nodes on the machine where I tested (running on nodesets and bound to the
corresponding cpusets as expected)
```
$ sudo lstopo --top --no-io -.xml|grep traffic_server|awk '{match($12, /(.*)"/, a); printf("%s %s ", $6, $9); system("ps -e -T -o pid,spid,ucmd|grep " a[1]);}'
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31101 [ET_NET 22]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31099 [ET_NET 20]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31097 [ET_NET 18]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31095 [ET_NET 16]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31093 [ET_NET 14]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31091 [ET_NET 12]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31089 [ET_NET 10]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31087 [ET_NET 8]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31085 [ET_NET 6]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31083 [ET_NET 4]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31081 [ET_NET 2]
allowed_cpuset="0x3ff003ff" allowed_nodeset="0x00000001" 31075 31079 [ET_NET 0]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31102 [ET_NET 23]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31100 [ET_NET 21]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31098 [ET_NET 19]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31096 [ET_NET 17]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31094 [ET_NET 15]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31092 [ET_NET 13]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31090 [ET_NET 11]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31088 [ET_NET 9]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31086 [ET_NET 7]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31084 [ET_NET 5]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31082 [ET_NET 3]
allowed_cpuset="0x000000ff,0xc00ffc00" allowed_nodeset="0x00000002" 31075 31080 [ET_NET 1]
```
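To make the ``allowed_cpuset`` / ``allowed_nodeset`` masks easier to read, here is a small sketch that expands them into CPU and node index lists. It assumes the hwloc bitmap string format (comma-separated 32-bit hex words, most significant word first); the helper names ``parse_hwloc_bitmap`` and ``to_ranges`` are made up for illustration and are not part of hwloc or Traffic Server.

```python
def parse_hwloc_bitmap(text):
    """Return the sorted bit indexes set in an hwloc-style bitmap string.

    Assumption: the string is a comma-separated list of 32-bit hex words,
    most significant word first (e.g. "0x000000ff,0xc00ffc00").
    """
    value = 0
    for word in text.split(","):
        value = (value << 32) | int(word, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def to_ranges(indexes):
    """Collapse a sorted index list such as [0, 1, 2, 5] into "0-2,5"."""
    ranges = []
    for index in indexes:
        if ranges and index == ranges[-1][1] + 1:
            ranges[-1][1] = index          # extend the current run
        else:
            ranges.append([index, index])  # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


# The two distinct cpuset/nodeset pairs from the lstopo output above.
for cpuset, nodeset in [
    ("0x3ff003ff", "0x00000001"),
    ("0x000000ff,0xc00ffc00", "0x00000002"),
]:
    nodes = to_ranges(parse_hwloc_bitmap(nodeset))
    cpus = to_ranges(parse_hwloc_bitmap(cpuset))
    print(f"node(s) {nodes}: CPUs {cpus}")
```

Under that assumption the even-numbered ``ET_NET`` threads map to node 0 with CPUs 0-9,20-29 and the odd-numbered ones to node 1 with CPUs 10-19,30-39.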
The NUMA policy is ``preferred=node0`` and ``preferred=node1`` for the
corresponding stack segments.
```
$ for pid in `ps -e -T -o 'spid,ucmd'|grep ET_NET |cut -d" " -f1 `; do sudo grep stack /proc/${pid}/numa_maps; done| awk '{match($3, /.*:(.*)/, a); printf("%s ", $2); system("ps -e -T -o pid,spid,ucmd|grep " a[1])}' |sort -u
. . .
prefer:0 31075 31079 [ET_NET 0]
prefer:0 31075 31080 [ET_NET 1]
prefer:0 31075 31081 [ET_NET 2]
prefer:0 31075 31082 [ET_NET 3]
. . .
prefer:0 31075 31100 [ET_NET 21]
prefer:0 31075 31101 [ET_NET 22]
prefer:0 31075 31102 [ET_NET 23]
. . .
prefer:1 31075 31079 [ET_NET 0]
prefer:1 31075 31080 [ET_NET 1]
prefer:1 31075 31081 [ET_NET 2]
prefer:1 31075 31082 [ET_NET 3]
. . .
prefer:1 31075 31100 [ET_NET 21]
prefer:1 31075 31101 [ET_NET 22]
prefer:1 31075 31102 [ET_NET 23]
```
Checked a few ``ET_NET`` stack segment sizes and they look OK
(as configured by ``proxy.config.thread.default.stacksize: 1048576``)
```
$ for pid in `ps -e -T -o 'spid,ucmd'|grep ET_NET |cut -d" " -f1 `; do sudo grep stack /proc/${pid}/maps; done| awk '{print $1}' |head
2aaaaaf66000-2aaaab066000
2aaaab76b000-2aaaab86b000
2aaab0c01000-2aaab0e01000
2aaab0e02000-2aaab0f02000
2aaab160f000-2aaab170f000
2aaab4001000-2aaab4101000
2aaab4102000-2aaab4202000
2aaab4203000-2aaab4303000
2aaab4801000-2aaab4901000
2aaab4902000-2aaab4a02000
```
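The segment sizes can be read off the address ranges by subtracting the start from the end. A minimal sketch of that arithmetic, using the ten ranges listed above (``segment_size`` is a hypothetical helper name, and the expected value is the ``1048576`` from the config setting quoted above):

```python
def segment_size(address_range):
    """Return the size in bytes of a "start-end" hex range from /proc/<pid>/maps."""
    start, end = (int(part, 16) for part in address_range.split("-"))
    return end - start


EXPECTED = 1048576  # proxy.config.thread.default.stacksize from the comment above

ranges = [
    "2aaaaaf66000-2aaaab066000",
    "2aaaab76b000-2aaaab86b000",
    "2aaab0c01000-2aaab0e01000",
    "2aaab0e02000-2aaab0f02000",
    "2aaab160f000-2aaab170f000",
    "2aaab4001000-2aaab4101000",
    "2aaab4102000-2aaab4202000",
    "2aaab4203000-2aaab4303000",
    "2aaab4801000-2aaab4901000",
    "2aaab4902000-2aaab4a02000",
]

for address_range in ranges:
    size = segment_size(address_range)
    note = "" if size == EXPECTED else "  <- differs from configured size"
    print(f"{address_range}  {size} bytes{note}")
```

Nine of the ten ranges come out at exactly 1048576 bytes; the third one (``2aaab0c01000-2aaab0e01000``) spans 2097152 bytes.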
Issue Time Tracking
-------------------
Worklog Id: (was: 28718)
Time Spent: 4h 10m (was: 4h)
> Fix up event processor thread stacks
> ------------------------------------
>
> Key: TS-4806
> URL: https://issues.apache.org/jira/browse/TS-4806
> Project: Traffic Server
> Issue Type: Improvement
> Components: Core
> Reporter: Phil Sorber
> Assignee: Phil Sorber
> Fix For: 7.0.0
>
> Time Spent: 4h 10m
> Remaining Estimate: 0h
>
> Fix event processor to create stacks on the appropriate numa node and with
> the appropriate page size. Also, stop using the main thread as ET_NET 0 since
> we can't control any of these aspects of it.