[ 
https://issues.apache.org/jira/browse/TS-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286338#comment-14286338
 ] 

Leif Hedstrom commented on TS-1336:
-----------------------------------

I spoke with God (jplevyak) on this issue recently. One suspicion here is that 
we're doing things either at too short timeout intervals, or scheduling things 
too frequently. Definitely worth looking into, and also, maybe TS-3313 will 
help with some of these issues.


> High CPU Usage at idle
> ----------------------
>
>                 Key: TS-1336
>                 URL: https://issues.apache.org/jira/browse/TS-1336
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Performance
>    Affects Versions: 3.0.5, 3.0.2
>         Environment: Ubuntu 12.04 server, amd64, Xenon E5520 (4-core, 16 
> cores with HT)
>            Reporter: Greg Smolyn
>              Labels: A
>             Fix For: sometime
>
>
> On this unloaded system, a very basic traffic server instance is using 180% 
> CPU, with 3 threads ET_TASK 0, ET_TASK 1, and LOGGING taking up about 60% 
> each.
> top -H output:
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND          
>                                                                               
>                       
> 10723 traffics  20   0 1960m 113m 4168 R   61  0.4   9:11.27 [ET_TASK 1]      
>                                                                               
>                        
> 10722 traffics  20   0 1960m 113m 4168 R   60  0.4   8:41.61 [ET_TASK 0]      
>                                                                               
>                        
> 10720 traffics  20   0 1960m 113m 4168 S   59  0.4   8:49.19 [LOGGING]        
>                                                                               
>                        
>    19 root      20   0     0    0    0 R   15  0.0 898:45.74 ksoftirqd/3      
>                                                                               
>                        
>    10 root      20   0     0    0    0 S   15  0.0 930:16.92 ksoftirqd/1      
>                                                                               
>                        
>    27 root      20   0     0    0    0 S   14  0.0 893:18.41 ksoftirqd/5      
>                                                                               
>                        
>    35 root      20   0     0    0    0 S   14  0.0 888:54.41 ksoftirqd/7      
>                                                                               
>                        
>     3 root      20   0     0    0    0 S    8  0.0 942:48.39 ksoftirqd/0      
>                                                                               
>                        
>    15 root      20   0     0    0    0 S    7  0.0 906:40.98 ksoftirqd/2      
>                                                                               
>                        
>    23 root      20   0     0    0    0 S    7  0.0 907:30.33 ksoftirqd/4      
>                                                                               
>                        
>    31 root      20   0     0    0    0 S    7  0.0 898:13.05 ksoftirqd/6      
>                                                                               
>                        
> 13530 root      20   0 98.2m 3244 2572 S    1  0.0  29:28.86 flip_server      
>                                                                               
>                        
>  9425 root      20   0 17568 1592 1060 R    0  0.0   0:04.16 top              
>                                                                               
>                        
> 10689 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.54 [ET_NET 5]       
>                                                                               
>                        
> 10693 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.51 [ET_NET 9]       
>                                                                               
>                        
> 10701 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.56 [ET_NET 17]      
>                                                                               
>                        
> 10702 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.53 [ET_NET 18]      
>                                                                               
>                        
> 10705 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.54 [ET_NET 21]      
>                                                                               
>                        
>     1 root      20   0 24328 2256 1344 S    0  0.0   0:02.53 init             
>                                                                               
>                        
>     2 root      20   0     0    0    0 S    0  0.0   0:00.05 kthreadd       
> stracing the ET_TASK threads showed a repeating set of calls to futex:
> futex(0x946ca4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 255405471, 
> {1341604150, 0}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> futex(0x946ce0, FUTEX_WAKE_PRIVATE, 1)  = 0
> I installed the symbols and interrupted the process with GDB... The ET_TASK 
> functions were generally looked like this:
> (gdb) bt
> #0  0x00002b4eb72b20fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #1  0x000000000065cad8 in ink_cond_timedwait (t=0x2b4ed64c4e00, 
> mp=0x2b4ed62c3bf8, cp=0x2b4ed62c3c20) at ../../lib/ts/ink_thread.h:273
> #2  ProtectedQueue::dequeue_timed (this=0x2b4ed62c3be0, cur_time=<optimized 
> out>, timeout=<optimized out>, sleep=<optimized out>) at ProtectedQueue.cc:149
> #3  0x000000000065eb0b in EThread::execute (this=0x2b4ed61c3010) at 
> UnixEThread.cc:277
> #4  0x000000000065cc92 in spawn_thread_internal (a=0x25f4230) at Thread.cc:88
> #5  0x00002b4eb72ade9a in start_thread () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #6  0x00002b4eb91a44bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #7  0x0000000000000000 in ?? ()
> (gdb) c
> Continuing.
> ^C
> Program received signal SIGINT, Interrupt.
> 0x00002b4eb72b49fa in __lll_unlock_wake () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> (gdb) bt
> #0  0x00002b4eb72b49fa in __lll_unlock_wake () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #1  0x00002b4eb72b1104 in _L_unlock_644 () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #2  0x00002b4eb72b1063 in pthread_mutex_unlock () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #3  0x000000000065c961 in ink_mutex_release (m=0x2b4ed62c3bf8) at 
> ../../lib/ts/ink_mutex.h:104
> #4  ProtectedQueue::dequeue_timed (this=0x2b4ed62c3be0, cur_time=<optimized 
> out>, timeout=<optimized out>, sleep=<optimized out>) at ProtectedQueue.cc:151
> #5  0x000000000065eb0b in EThread::execute (this=0x2b4ed61c3010) at 
> UnixEThread.cc:277
> #6  0x000000000065cc92 in spawn_thread_internal (a=0x25f4230) at Thread.cc:88
> #7  0x00002b4eb72ade9a in start_thread () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #8  0x00002b4eb91a44bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #9  0x0000000000000000 in ?? ()
> (gdb) c
> Continuing.
> ^C
> Program received signal SIGINT, Interrupt.
> 0x00002b4eb72b20fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> (gdb) bt
> #0  0x00002b4eb72b20fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #1  0x000000000065cad8 in ink_cond_timedwait (t=0x2b4ed64c4e00, 
> mp=0x2b4ed62c3bf8, cp=0x2b4ed62c3c20) at ../../lib/ts/ink_thread.h:273
> #2  ProtectedQueue::dequeue_timed (this=0x2b4ed62c3be0, cur_time=<optimized 
> out>, timeout=<optimized out>, sleep=<optimized out>) at ProtectedQueue.cc:149
> #3  0x000000000065eb0b in EThread::execute (this=0x2b4ed61c3010) at 
> UnixEThread.cc:277
> #4  0x000000000065cc92 in spawn_thread_internal (a=0x25f4230) at Thread.cc:88
> #5  0x00002b4eb72ade9a in start_thread () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #6  0x00002b4eb91a44bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #7  0x0000000000000000 in ?? ()
> (gdb) c
> Continuing.
> This is a problem for the servers as [a] it keeps us from running in lower 
> power states and [b] makes sysadmins think the servers are under load when 
> they aren't.
> I am happy to help find out more information and do more debugging.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to