From: Joel Sherrill []
Sent: Thursday, October 13, 2016 17:38
To: Jakob Viketoft
Subject: Re: Time spent in ticks...

>I don't have an or1k handy so ran on a sparc/erc32 simulator/
>It is is a SPARC v7 at 15 Mhz.

>These times are in microseconds and based on the tmtests.
>Specifically tm08and tm27.

>(1) rtems_clock_tick: only case - 52
>(2) rtems interrupt: entry overhead returns to interrupted task - 12
>(3) rtems interrupt: exit overhead returns to interrupted task - 4
>(4) rtems interrupt: entry overhead returns to nested interrupt - 11
>(5) rtems interrupt: exit overhead returns to nested interrupt - 3

>The clock tick test has 100 tasks but it looks like they are blocked on a 
>without timeout.

>Your times look WAY too high. Maybe the interrupt is stuck on and
>not being cleared.

>On the erc32, a nominal "nothing to do clock tick" would be 1+2+3 from
>above or 52+12+4 = 68 microseconds. 68 * 15 = 1020 machine cycles.
>So at a higher clock rate, it should be even less time.

>My gut feeling is that I think something is wrong with the ISR handler
>and it is stuck. But the performance is definitely way too high.


(Sorry if the format got somewhat I garbled, anything but top-posting have to 
be done manually...)

I re-tested my case using an -O3 optimization (we have been using -O0 during 
development for debugging purposes) and I got a good performance boost, but I'm 
still nowhere near your numbers. I can vouch for that the interrupt (exception 
really) isn't stuck, but that the code unfortunately takes a long time to 
compute. I have a subsecond counter (1/16 of a second) which I'm sampling at 
various places in the code, storing its numbers to a buffer in memory so as to 
interfere with the program as little as possible.

With -O3, a tick handling still takes ~320 us to perform, but the weight has 
now shifted. tc_windup takes ~214 us and the rest is obviously 
_Watchdog_Tick(). When fragmenting the tc_windup function to find the worst 
speed bumps the biggest contribution (~122 us) seem to be coming from scale 
factor recalculation. Since it's 64 bits, it's turned into a software function 
which can be quite time-consuming apparently.

Even though _Watchdog_Tick() "only" takes ~100 us now, it still sound much 
higher than your total tick with a slower system (we're running at 50 MHz).

Is there anything we can do to improve these numbers? Is Clock_isr intended to 
be run uninterrupted as it is now? Can't see that much of the BSP patch code 
has anything to do with the speed of what I'm looking at right now...


Jakob Viketoft
Senior Engineer in RTL and embedded software

ÅAC Microtec AB
Dag Hammarskjölds väg 48
SE-751 83 Uppsala, Sweden

T: +46 702 80 95 97
devel mailing list

Reply via email to