Hi all,
<p>
I'm trying to debug a strange watchdog timeout on a T5220 running Solaris 10
10/08, and I'm starting to suspect that the problem lies at least partly in the
cyclic subsystem.
<p>
Here's the scenario: I'm developing an app that monitors its own CPU
utilization by reading /proc/<pid>/usage. This happens every 7 ms or so,
driven by a high-resolution timer that posts an event to an event port, which
the utilization-monitor thread waits on.
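<p>
For reference, here's roughly what the timer/port/usage plumbing looks like.
This is a stripped-down sketch rather than the actual app code: the file name,
the 7071us interval, and the missing error handling are all simplifications
on my part.
<p>
<pre>
/*
 * Minimal sketch: a CLOCK_HIGHRES interval timer posts an event to an
 * event port each time it fires; the monitor thread waits on the port
 * and reads /proc/self/usage.  CLOCK_HIGHRES timers need the
 * proc_clock_highres privilege; build with something like
 * "cc usagemon.c -lrt".
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <signal.h>
#include <time.h>
#include <port.h>
#include <procfs.h>

int
main(void)
{
        int port = port_create();
        int fd = open("/proc/self/usage", O_RDONLY);
        timer_t tid;
        struct sigevent sev;
        port_notify_t pn;
        struct itimerspec its;
        port_event_t pe;
        prusage_t pru;

        /* Have each timer expiration post an event to the port. */
        pn.portnfy_port = port;
        pn.portnfy_user = NULL;
        (void) memset(&sev, 0, sizeof (sev));
        sev.sigev_notify = SIGEV_PORT;
        sev.sigev_value.sival_ptr = &pn;
        (void) timer_create(CLOCK_HIGHRES, &sev, &tid);

        /* ~7ms period, matching the 7071us USECINT seen in ::cycinfo. */
        its.it_value.tv_sec = 0;
        its.it_value.tv_nsec = 7071000;
        its.it_interval = its.it_value;
        (void) timer_settime(tid, 0, &its, NULL);

        /* Utilization-monitor loop: block on the port, then read usage. */
        for (;;) {
                if (port_get(port, &pe, NULL) != 0)
                        continue;
                if (pread(fd, &pru, sizeof (pru), 0) != sizeof (pru))
                        break;
                /* real app computes cpu% from pr_utime/pr_stime deltas */
        }
        return (0);
}
</pre>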
<p>
This had been working fine for quite a while at resolutions ranging from
1 to 10 ms, but these watchdog timeouts started occurring just recently. After
changing the watchdog to force an all-pages kernel dump and pointing mdb at the
result, I found that the scheduler was blocked on my app's process lock and had
missed 277 ticks, that the watchdog_pat cyclic was scheduled on the same CPU,
and that my usage-reading timer was more than 11 seconds behind schedule.
<p>
So, some questions:
<p>
1. What is the cyclic backend supposed to do when it gets behind schedule?
Looking at the code, the cbe doesn't drop to lower interrupt levels until after
it has processed any backlog. Could this be some sort of priority inversion,
where the high-level watchdog can't run because the lock-level scheduler cyclic
is blocked and the high-level cyclic timer tick can't clear the backlog for
some reason?
<p>
2. How could reading /proc/<pid>/usage take long enough to prevent the kernel
scheduler from running for 277 ticks? 7 ms isn't *that* often, is it? (There's
a rough timing sketch after question 4.)
<p>
3. Am I doing something wrong? I realize that high-res timers give an app
enough rope to impact the whole system, but it seems a bit extreme for a 7 ms
timer event to bring the system down.
<p>
4. If I had used an interrupt instead of a port, could that have caused a
global deadlock on the process lock? pr_read_usage(), the scheduler, and
timer_fire() would all be trying to grab it.
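<p>
To put rough numbers behind questions 2 and 3, something like the following
would show how long the usage read itself takes and how far each wakeup drifts
behind the timer's schedule. This is only a sketch: timed_usage_read() and
INTERVAL_NS are made-up names, and it isn't code from the app.
<p>
<pre>
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>        /* gethrtime(), hrtime_t */
#include <procfs.h>

#define INTERVAL_NS 7071000LL    /* same ~7ms period as the timer */

/*
 * Intended to be called from the port_get() loop, advancing expected_fire
 * by INTERVAL_NS for every timer event consumed.
 */
static void
timed_usage_read(int fd, hrtime_t expected_fire)
{
        prusage_t pru;
        hrtime_t before, after;

        before = gethrtime();
        (void) pread(fd, &pru, sizeof (pru), 0);
        after = gethrtime();

        /* How long did the usage read itself take? */
        if (after - before > INTERVAL_NS)
                (void) fprintf(stderr, "usage read took %lld ns\n",
                    (long long)(after - before));

        /* How far behind the timer's schedule is this wakeup? */
        if (before - expected_fire > INTERVAL_NS)
                (void) fprintf(stderr, "wakeup %lld ns behind schedule\n",
                    (long long)(before - expected_fire));
}
</pre>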
<p>
More details below...
<p>
Thanks!<br>
Ryan
<p>
Note: I'm not actually sure whether the system becomes unresponsive before the
timeout, or if it's the watchdog-triggered kernel panic that does it, but I
think it's the latter.
<p>
<pre>
> $C
000002a10007f201 vpanic(10acbd0, 8, 7df3948, 0, 0, 10ac800)
000002a10007f2b1 process_nonresumable_error+0x33c(2a10007fca0, 0, 5, 40, 0, 0)
000002a10007f3f1 ktl0+0x64(9f398e56b780, 0, 9f39a9292c00, 9f398e7ffcee, a0, 14)
000002a10007f541 4(70010080, 1, 6007b885890, 1, 115, 1000000000000)
000002a10007f5f1 cyclic_fire+0x64(30003578d98, 9f398e56b780, 3b9ac800, 0,
30001cc4c08, 6007b885840)
000002a10007f6a1 cbe_level14+0x34(0, 0, e, 180c000, 10b0, 100fe98)
000002a10007f751 current_thread+0x164(e151a4019f7f, 14da87b30, 1, 1, 2,
105f18c50)
000002a100046d61 default_lock_delay+0x6c(0, 15080023e5, 2a100047ca0, 182b000,
1, 182b000)
<b>000002a100046e11 mutex_vector_enter+0x460(1872380, 2, 60055308d00,
30005b4ae40, 106d5f8, 0)</b>
000002a100046ec1 clock_tick_process+0xc8(30004b1c1c0, d5bb9, 1, 30004b1c1c0,
180c000, 60055308d00)
000002a100046f71 clock_tick_execute_common+0x10(0, 30, 40, d5bb9, 1, 1863000)
000002a100047021 clock+0x444(1, 3b9ac9ff, 0, 5, 3, 5)
000002a100047121 cyclic_softint+0xac(6005542c200, 6007b885890, 1, 6005542c180,
6005af464e0, 6007b885840)
000002a1000471e1 cbe_level10+8(0, 0, 180c000, 2a100047d78, 1, 1011e10)
000002a100047291 intr_thread+0x168(ffffffff72dfa038, 14da87b30, 1, 1, 2,
105f18c50)
</pre>
<p>
0x60055308d00 is the process lock for my app; when the watchdog timed out, the
app's thread was reading its own /proc/<pid>/usage and holding the process
lock:
<p>
<pre>
> 30005b4ae40::findstack
stack pointer for thread 30005b4ae40: 2a103e56bc1
[ 000002a103e56bc1 panic_idle+0x14() ]
000002a103e56c71 ktl0+0x64()
000002a103e56dc1 praddusage+0x38()
000002a103e57001 pr_read_usage+0x190()
000002a103e570b1 fop_read+0x20()
000002a103e57161 pread+0x244()
000002a103e572e1 syscall_trap+0xac()
> 30005b4ae40::print kthread_t t_procp |::print proc_t p_lockp
p_lockp = 0x60055308d00
</pre>
<p>
Poking around further showed that the watchdog_pat and the scheduler clock
cyclics were both registered on CPU 0. The clock cyclic was nearly 3 seconds
behind schedule (277 pending firings at a 10000 us interval), and the usage
reader was nearly 12 seconds behind (1664 pending firings at 7071 us):
<p>
<pre>
> ::cycinfo -v
CPU CYC_CPU STATE NELEMS ROOT FIRE HANDLER
0 6005542c180 online 6 6007b885840 9f398e56b780 clock
2
|
+------------------+------------------+
4 0
| |
+---------+--------+ +---------+---------+
3 1 5
| | |
+----+----+ +----+----+ +----+----+
ADDR NDX HEAP LEVL PEND FIRE USECINT HANDLER
6007b885840 0 2 high 0 9f398e56b780 10000 cbe_hres_tick
6007b885868 1 4 high 0 9f39a9292c00 4375000 watchdog_pat
6007b885890 2 0 lock 277 9f398e56b780 10000 clock
6007b8858b8 3 3 high 0 9f39a9292c00 1000000 deadman
6007b8858e0 4 1 low 279 9f398e7ffcee 9900 dtrace_state_clean
6007b885908 5 5 low 5 9f39a9292c00 1000000 dtrace_state_deadman
... snip ...
CPU CYC_CPU STATE NELEMS ROOT FIRE HANDLER
17 60055701a40 online 2 6007c0c00a0 9f3f44e31fbc clock_highres_fire
1
|
+------------------+------------------+
0
|
+---------+--------+
ADDR NDX HEAP LEVL PEND FIRE USECINT HANDLER
6007c0c00a0 0 1 high 0 9f3f7f3c2d6a 1000000 deadman
6007c0c00c8 1 0 low 1664 9f3f44e31fbc 7071 clock_highres_fire
</pre>