Greg Smolyn created TS-1336:
-------------------------------

             Summary: High CPU Usage at idle
                 Key: TS-1336
                 URL: https://issues.apache.org/jira/browse/TS-1336
             Project: Traffic Server
          Issue Type: Bug
          Components: Performance
    Affects Versions: 3.0.2, 3.0.5
         Environment: Ubuntu 12.04 server, amd64, Xenon E5520 (4-core, 16 cores 
with HT)
            Reporter: Greg Smolyn


On this unloaded system, a very basic traffic server instance is using 180% 
CPU, with 3 threads ET_TASK 0, ET_TASK 1, and LOGGING taking up about 60% each.

top -H output:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
                                                                                
                  
10723 traffics  20   0 1960m 113m 4168 R   61  0.4   9:11.27 [ET_TASK 1]        
                                                                                
                   
10722 traffics  20   0 1960m 113m 4168 R   60  0.4   8:41.61 [ET_TASK 0]        
                                                                                
                   
10720 traffics  20   0 1960m 113m 4168 S   59  0.4   8:49.19 [LOGGING]          
                                                                                
                   
   19 root      20   0     0    0    0 R   15  0.0 898:45.74 ksoftirqd/3        
                                                                                
                   
   10 root      20   0     0    0    0 S   15  0.0 930:16.92 ksoftirqd/1        
                                                                                
                   
   27 root      20   0     0    0    0 S   14  0.0 893:18.41 ksoftirqd/5        
                                                                                
                   
   35 root      20   0     0    0    0 S   14  0.0 888:54.41 ksoftirqd/7        
                                                                                
                   
    3 root      20   0     0    0    0 S    8  0.0 942:48.39 ksoftirqd/0        
                                                                                
                   
   15 root      20   0     0    0    0 S    7  0.0 906:40.98 ksoftirqd/2        
                                                                                
                   
   23 root      20   0     0    0    0 S    7  0.0 907:30.33 ksoftirqd/4        
                                                                                
                   
   31 root      20   0     0    0    0 S    7  0.0 898:13.05 ksoftirqd/6        
                                                                                
                   
13530 root      20   0 98.2m 3244 2572 S    1  0.0  29:28.86 flip_server        
                                                                                
                   
 9425 root      20   0 17568 1592 1060 R    0  0.0   0:04.16 top                
                                                                                
                   
10689 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.54 [ET_NET 5]         
                                                                                
                   
10693 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.51 [ET_NET 9]         
                                                                                
                   
10701 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.56 [ET_NET 17]        
                                                                                
                   
10702 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.53 [ET_NET 18]        
                                                                                
                   
10705 traffics  20   0 1960m 113m 4168 S    0  0.4   0:00.54 [ET_NET 21]        
                                                                                
                   
    1 root      20   0 24328 2256 1344 S    0  0.0   0:02.53 init               
                                                                                
                   
    2 root      20   0     0    0    0 S    0  0.0   0:00.05 kthreadd       


stracing the ET_TASK threads showed a repeating set of calls to futex:
futex(0x946ca4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 255405471, 
{1341604150, 0}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x946ce0, FUTEX_WAKE_PRIVATE, 1)  = 0

I installed the symbols and interrupted the process with GDB... The ET_TASK 
functions were generally looked like this:

(gdb) bt
#0  0x00002b4eb72b20fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x000000000065cad8 in ink_cond_timedwait (t=0x2b4ed64c4e00, 
mp=0x2b4ed62c3bf8, cp=0x2b4ed62c3c20) at ../../lib/ts/ink_thread.h:273
#2  ProtectedQueue::dequeue_timed (this=0x2b4ed62c3be0, cur_time=<optimized 
out>, timeout=<optimized out>, sleep=<optimized out>) at ProtectedQueue.cc:149
#3  0x000000000065eb0b in EThread::execute (this=0x2b4ed61c3010) at 
UnixEThread.cc:277
#4  0x000000000065cc92 in spawn_thread_internal (a=0x25f4230) at Thread.cc:88
#5  0x00002b4eb72ade9a in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00002b4eb91a44bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00002b4eb72b49fa in __lll_unlock_wake () from 
/lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0  0x00002b4eb72b49fa in __lll_unlock_wake () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00002b4eb72b1104 in _L_unlock_644 () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00002b4eb72b1063 in pthread_mutex_unlock () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#3  0x000000000065c961 in ink_mutex_release (m=0x2b4ed62c3bf8) at 
../../lib/ts/ink_mutex.h:104
#4  ProtectedQueue::dequeue_timed (this=0x2b4ed62c3be0, cur_time=<optimized 
out>, timeout=<optimized out>, sleep=<optimized out>) at ProtectedQueue.cc:151
#5  0x000000000065eb0b in EThread::execute (this=0x2b4ed61c3010) at 
UnixEThread.cc:277
#6  0x000000000065cc92 in spawn_thread_internal (a=0x25f4230) at Thread.cc:88
#7  0x00002b4eb72ade9a in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#8  0x00002b4eb91a44bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#9  0x0000000000000000 in ?? ()
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00002b4eb72b20fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
/lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0  0x00002b4eb72b20fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x000000000065cad8 in ink_cond_timedwait (t=0x2b4ed64c4e00, 
mp=0x2b4ed62c3bf8, cp=0x2b4ed62c3c20) at ../../lib/ts/ink_thread.h:273
#2  ProtectedQueue::dequeue_timed (this=0x2b4ed62c3be0, cur_time=<optimized 
out>, timeout=<optimized out>, sleep=<optimized out>) at ProtectedQueue.cc:149
#3  0x000000000065eb0b in EThread::execute (this=0x2b4ed61c3010) at 
UnixEThread.cc:277
#4  0x000000000065cc92 in spawn_thread_internal (a=0x25f4230) at Thread.cc:88
#5  0x00002b4eb72ade9a in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00002b4eb91a44bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()
(gdb) c
Continuing.


This is a problem for the servers as [a] it keeps us from running in lower 
power states and [b] makes sysadmins think the servers are under load when they 
aren't.

I am happy to help find out more information and do more debugging.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to