Re: [Xenomai-core] [PATCH 0/3] Reworked nucleus statistics

2006-10-27 Thread Philippe Gerum
On Thu, 2006-10-12 at 21:43 +0200, Jan Kiszka wrote:
> Hi,
> 
> here we go: after quite some hacking and refactoring, Dmitry and I are
> happy to provide a patch set that reworks and enhances the statistics
> subsystem of Xenomai.
> 
> The original goal of these patches was to improve the accuracy of the
> /proc/xenomai/stat CPU load output. So far, time was only accounted at
> thread switches, so the time of any preceding IRQ handling and
> scheduling decision was added to the preempted thread. The new approach
> avoids this. More about it later.
> 
> While discussing the first implementation, Dmitry had the idea to
> refactor even more code that depends on XENO_OPT_STATS, namely to move
> the event counting parts under the upcoming generic runtime stats. So
> this series starts with a patch that introduces a generic subsystem for
> collecting statistics on countable events as well as the runtime of
> entities.
> 
> It continues with the second patch that applies xnstat on the IRQ
> subsystem, both for counting hits as well as for measuring the execution
> time. The accounting model applied in this patch is as simple as this:
> measure the time some driver- or application-supplied ISR executes and
> accumulate it per-CPU. The rescheduling is still accounted to the
> preempted thread.
> 
> In my endless quest for perfection, I applied an -as I feel- enhanced
> model on top of this (already working!) set, that's the third patch.
> This model adds the scheduler path to the IRQ account. And it only
> accounts to an IRQ if its ISR reported XN_ISR_HANDLED. This is relevant
> for shared IRQs when only one source fired (the typical case). Also, it
> reduces churning by avoiding account switches in the average case. But,
> the downside, it may be less convenient to understand and increases the
> code a bit (only for the shared IRQ case). Dmitry and I were not yet
> able to agree on THE model, so I'm simply posting both for public
> feedback. :)

All merged, thanks.

-- 
Philippe.



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [PATCH 0/3] Reworked nucleus statistics

2006-10-19 Thread Philippe Gerum
On Wed, 2006-10-18 at 22:06 +0200, Dmitry Adamushko wrote:
> 
> ...
> 
> Dmitry, are there other issues I'm missing, that you think
> would be
> better solved using a simpler accounting model?
> 
> So according to you, the more complex model gives users a better
> understanding of behavior of their drivers/applications?
> 

I said the latter goal should be considered as most important one when
discussing this particular issue. Each approach having its pros/cons
regarding this goal depending on the aspect considered
(shared/non-shared, accounting accuracy etc.), the choice is about
picking the one which fulfills this goal as much as possible.

Basically, your respective approaches (i.e. Jan and yours) differ on the
criterion measuring such "fulfillment", and since there is likely no
perfect solution, we should make a trade-off on that. To sum up, you
consider that unambiguous shared IRQ accounting carries more weight,
whilst Jan seems to put more weight on charging any rescheduling cost
that could result from the ISR to the proper statistic counter, i.e. the
ISR, and not any random thread the corresponding IRQ happened to
preempt.

My gut feeling, as a possible user of this code, would be to favour
Jan's option, not because yours is essentially wrong, but rather because
it's missing what Jan's solution brings regarding the rescheduling
issue. In practice, a rescheduling costs a _lot_ more to perform
time-wise by the nucleus, than the less ambiguous measurement you could
do from an ISR prologue. My point is that, in that particular case, I'd
always prefer getting the right information from the most significant
order of magnitude.

Regarding the accounting upon XN_ISR_HANDLED status, well, it's again a
matter of trade-off, even if the two options seem much more balanced in
their respective merits and caveats. I could vote for any of them; this
said, I think Jan's argument about shared IRQs being unusual (and I
would even add sub-optimal in a RT context) should be taken for what it
says: we had better optimize for the common case when there is
no real win in optimizing for the rare one. My initial question boils
down to making sure that I did not underestimate the importance of
improving the rare case.

[...]

-- 
Philippe.





Re: [Xenomai-core] [PATCH 0/3] Reworked nucleus statistics

2006-10-19 Thread Dmitry Adamushko
On 18/10/06, Jan Kiszka <[EMAIL PROTECTED]> wrote:
>> (1) time spent in ISR;
>> (2) some prologue for the very first shared handler and epilogue - for the
>> last one.
>
> (2) is the rare case that multiple IRQs happen at the same time on the
> same line. As I said, the case when only one occurs and the others are
> needlessly tested is more often.

Indeed, it looks like "prologue" and "epilogue" are always accounted to
the ISR that reported XN_ISR_HANDLED. And if it's only one, then its
position in the chain doesn't matter.

>> The "simple" model accounts only (1) to ISR time statistics so it's always
>> easy to say what this number means. It just describes the load caused
>> by a corresponding ISR handler.
>
> [sorry, Dmitry, for this replay ;)]

[off-topic]
I somehow read it as "sorry for this _reply_" and thought you had
expressed below all you think about my complaints and myself... but, ok.
you are more polit-correct :)))

> I designed this instrumentation with a fairly old question of mine in
> the head: "What does raising this IRQ, say, at 1 KHz costs the system?"
> And this question includes not just the ISR itself, but also the
> disturbance due to potential rescheduling.

Ok.

> Shared Interrupt are indeed an exception, and if we were only discussing
> non-shared instrumentation here, I guess it wouldn't be that tricky to
> find a common model. If you look at that part, I'm basically shifting
> some instrumentation point after the rescheduling, that's it.

yes, per-IRQ accounting (as opposed to per-ISR) wouldn't have a
background for such objections.

Well, if we conclude that the enhanced scheme provides better
informativeness and everyone benefits from it, then let it be so. I
don't have any more objections [chores : uufff, haleluia!]

> Jan

-- 
Best regards,
Dmitry Adamushko




Re: [Xenomai-core] [PATCH 0/3] Reworked nucleus statistics

2006-10-18 Thread Jan Kiszka
Dmitry Adamushko wrote:
>> ...
>>
>> Dmitry, are there other issues I'm missing, that you think would be
>> better solved using a simpler accounting model?
> 
> 
> So according to you, the more complex model gives users a better
> understanding of behavior of their drivers/applications?
> 
> ...
>>   0  0  0  201357   0   2.8  IRQ1: handler1
>>   0  0  0  5811   0   2.4  IRQ1: handler2
> 
> what is % in this case?
> 
> it accounts :
> 
> (1) time spent in ISR;
> 
> (2) some prologue for the very first shared handler and epilogue - for the
> last one.

(2) is the rare case that multiple IRQs happen at the same time on the
same line. As I said, the case when only one occurs and the others are
needlessly tested is more often.

> 
> So in fact, part (1) can be the same for both handlers (say, just the same
> code), N can be == M (irq frequency) but % numbers are different. Is it
> fair?
> 
> The "simple" model accounts only (1) to ISR time statistics so it's always
> easy to say what this number means. It just describes the load caused
> by a corresponding ISR handler.

[sorry, Dmitry, for this replay ;)]
I designed this instrumentation with a fairly old question of mine in
the head: "What does raising this IRQ, say, at 1 KHz costs the system?"
And this question includes not just the ISR itself, but also the
disturbance due to potential rescheduling.

> 
> Ok, if somebody wants to tell me that shared interrupts is an exceptional
> and rare case, then I got it. But anyway, as long as we want to provide
> per-ISR (and not per-IRQ) statistics, it adds some unclearness (unfairness)
> on how to consider reported values.

Shared Interrupt are indeed an exception, and if we were only discussing
non-shared instrumentation here, I guess it wouldn't be that tricky to
find a common model. If you look at that part, I'm basically shifting
some instrumentation point after the rescheduling, that's it.

> 
> I also expected that those prologue and epilogue are likely to cause much
> lighter "disturbance" being added to a preempted thread (because such
> threads normally should have higher % load wrt ISR anyway).
> 
> Regarding accounting only XN_ISR_HANDLED cases. At least, I suppose,
> ISR[n+1] doesn't get accounted the interval of time spent by ISR[n] to
> report XN_ISR_NONE?
> 
> Which brings us to the next point that I didn't get it from the 3rd patch
> after looking at it briefly (well, I didn't apply it yet).
> And in general, it's getting a bit more difficult to see interrupt handling
> details behind statistic-handling ones, esp. in the case of shared
> interrupts. Of course, that's really a minor issue as long as users may get
> better statistic information :)
> 
> ok, that was kind of a summary, I hope I didn't forget anything else as we
> already had quite a long discussion with Jan.
> 
> all in all (yep, I like repeating myself :) IMHO the simple model, at
> least, clearly answers a question what's behind the numbers while the
> complex one makes the answer more complex trying to be fair in one place
> and at the same time adding "unfairness" in another place.
> 

I don't see that the remaining unfair parts are significant - at least
/wrt what I want to analyse here.

Jan





Re: [Xenomai-core] [PATCH 0/3] Reworked nucleus statistics

2006-10-18 Thread Dmitry Adamushko

> ...
> Dmitry, are there other issues I'm missing, that you think would be
> better solved using a simpler accounting model?

So according to you, the more complex model gives users a better understanding of behavior of their drivers/applications?

...
>   0  0  0  201357   0   2.8  IRQ1: handler1
>   0  0  0  5811   0   2.4  IRQ1: handler2

what is % in this case?

it accounts :

(1) time spent in ISR;

(2) some prologue for the very first shared handler
and epilogue - for the last one.

So in fact, part (1) can be the same for both handlers (say, just the
same code), N can be == M (irq frequency) but % numbers are different.
Is it fair?

The "simple" model accounts only (1) to ISR time statistics so it's
always easy to say what this number means. It just describes the load
caused by a corresponding ISR handler.

Ok, if somebody wants to tell me that shared interrupts is an
exceptional and rare case, then I got it. But anyway, as long as we
want to provide per-ISR (and not per-IRQ) statistics, it adds some
unclearness (unfairness) on how to consider reported values.

I also expected that those prologue and epilogue are likely to cause
much lighter "disturbance" being added to a preempted thread (because
such threads normally should have higher % load wrt ISR anyway).

Regarding accounting only XN_ISR_HANDLED cases. At least, I
suppose, ISR[n+1] doesn't get accounted the interval of time spent by
ISR[n] to report XN_ISR_NONE?

Which brings us to the next point that I didn't get it from the 3rd
patch after looking at it briefly (well, I didn't apply it yet).
And in general, it's getting a bit more difficult to see interrupt
handling details behind statistic-handling ones, esp. in the case of
shared interrupts. Of course, that's really a minor issue as long as
users may get better statistic information :)

ok, that was kind of a summary, I hope I didn't forget anything else as
we already had quite a long discussion with Jan.

all in all (yep, I like repeating myself :) IMHO the simple model, at
least, clearly answers a question what's behind the numbers while the
complex one makes the answer more complex trying to be fair in one
place and at the same time adding "unfairness" in another place.

> -- 
> Philippe.

-- 
Best regards,
Dmitry Adamushko




Re: [Xenomai-core] [PATCH 0/3] Reworked nucleus statistics

2006-10-18 Thread Philippe Gerum
On Thu, 2006-10-12 at 21:43 +0200, Jan Kiszka wrote:

[...]

> It continues with the second patch that applies xnstat on the IRQ
> subsystem, both for counting hits as well as for measuring the execution
> time. The accounting model applied in this patch is as simple as this:
> measure the time some driver- or application-supplied ISR executes and
> accumulate it per-CPU. The rescheduling is still accounted to the
> preempted thread.
> 
> In my endless quest for perfection, I applied an -as I feel- enhanced
> model on top of this (already working!) set, that's the third patch.
> This model adds the scheduler path to the IRQ account. And it only
> accounts to an IRQ if its ISR reported XN_ISR_HANDLED. This is relevant
> for shared IRQs when only one source fired (the typical case). Also, it
> reduces churning by avoiding account switches in the average case. But,
> the downside, it may be less convenient to understand and increases the
> code a bit (only for the shared IRQ case). Dmitry and I were not yet
> able to agree on THE model, so I'm simply posting both for public
> feedback. :)

The bottom line is that /proc/xenomai/stat should provide information
allowing people to better understand how their applications/drivers
perform, which has a higher priority than allowing us to understand how
the nucleus behaves. In this vein, having the rescheduling path
accounted and charged to the ISR - and not to the preempted thread -
provides a more accurate information, since this operation is quite
significant time-wise, and it seems unfair and inaccurate to charge any
random thread for this.

The issue about whether we should integrate every bit of the ISR runtime
into the accounted value (including source detection for shared
interrupts), or not by filtering on XN_ISR_HANDLED, seems related to the
previous point. At first sight, I would also filter out unhandled IRQs,
given that accepting an interrupt and performing actions upon it should
be more significant time-wise than solely probing the IRQ sources. 

Dmitry, are there other issues I'm missing, that you think would be
better solved using a simpler accounting model?

-- 
Philippe.





[Xenomai-core] [PATCH 0/3] Reworked nucleus statistics

2006-10-12 Thread Jan Kiszka
Hi,

here we go: after quite some hacking and refactoring, Dmitry and I are
happy to provide a patch set that reworks and enhances the statistics
subsystem of Xenomai.

The original goal of these patches was to improve the accuracy of the
/proc/xenomai/stat CPU load output. So far, time was only accounted at
thread switches, so the time of any preceding IRQ handling and
scheduling decision was added to the preempted thread. The new approach
avoids this. More about it later.

While discussing the first implementation, Dmitry had the idea to
refactor even more code that depends on XENO_OPT_STATS, namely to move
the event counting parts under the upcoming generic runtime stats. So
this series starts with a patch that introduces a generic subsystem for
collecting statistics on countable events as well as the runtime of
entities.

It continues with the second patch that applies xnstat on the IRQ
subsystem, both for counting hits as well as for measuring the execution
time. The accounting model applied in this patch is as simple as this:
measure the time some driver- or application-supplied ISR executes and
accumulate it per-CPU. The rescheduling is still accounted to the
preempted thread.
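
A minimal sketch of this simple model, in C. All names here
(stat_account_t, cpu_stat_t, stat_switch, simple_model_irq) are
hypothetical and chosen for illustration only; this is not the xnstat
code from the patch, and timestamps are passed in by hand instead of
being read from a clock:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runtime account: one per thread or per IRQ handler. */
typedef struct {
    uint64_t hits;      /* number of accounted events */
    uint64_t total_ns;  /* accumulated execution time */
} stat_account_t;

/* Hypothetical per-CPU bookkeeping. */
typedef struct {
    stat_account_t *current;  /* account charged for time passing now */
    uint64_t last_switch_ns;  /* timestamp of the last account switch */
} cpu_stat_t;

/* Charge the time since the last switch to the current account, then
 * make `next` the current one. Returns the account that was current. */
static stat_account_t *stat_switch(cpu_stat_t *cpu, stat_account_t *next,
                                   uint64_t now_ns)
{
    stat_account_t *prev = cpu->current;

    assert(now_ns >= cpu->last_switch_ns);
    if (prev != NULL)
        prev->total_ns += now_ns - cpu->last_switch_ns;
    cpu->current = next;
    cpu->last_switch_ns = now_ns;
    return prev;
}

/* Simple model: only the ISR body, [entry_ns, entry_ns + isr_ns), is
 * charged to the IRQ account. Whatever follows (e.g. rescheduling) is
 * charged to the preempted account again. */
static void simple_model_irq(cpu_stat_t *cpu, stat_account_t *irq,
                             uint64_t entry_ns, uint64_t isr_ns)
{
    stat_account_t *preempted = stat_switch(cpu, irq, entry_ns);

    irq->hits++;
    /* ... the ISR would run here ... */
    stat_switch(cpu, preempted, entry_ns + isr_ns);
}
```

For example, with a thread running from t=0, an IRQ at t=100 whose ISR
takes 20 and a rescheduling of 30 afterwards, the IRQ account ends up
with 20 and the thread with 130 once the next switch happens at t=150.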

In my endless quest for perfection, I applied an -as I feel- enhanced
model on top of this (already working!) set, that's the third patch.
This model adds the scheduler path to the IRQ account. And it only
accounts to an IRQ if its ISR reported XN_ISR_HANDLED. This is relevant
for shared IRQs when only one source fired (the typical case). Also, it
reduces churning by avoiding account switches in the average case. But,
the downside, it may be less convenient to understand and increases the
code a bit (only for the shared IRQ case). Dmitry and I were not yet
able to agree on THE model, so I'm simply posting both for public
feedback. :)
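
To make the difference concrete, here is a small C simulation of how
the enhanced model could charge time on a shared line. It is a sketch
under assumptions, not the code of patch 3/3: the type and function
names are hypothetical, `fired` stands in for an ISR returning
XN_ISR_HANDLED, and I assume that time spent probing non-firing
handlers is charged to the next handler that reports "handled", while
the rescheduling goes to the last one that did (or stays with the
preempted thread if nobody handled the IRQ):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runtime account. */
typedef struct {
    uint64_t hits;
    uint64_t total_ns;
} stat_account_t;

/* One handler on a shared IRQ line; `fired` and `cost_ns` simulate
 * the device state and the time the ISR needs to check/serve it. */
typedef struct {
    int fired;               /* nonzero: ISR would report "handled" */
    uint64_t cost_ns;        /* simulated execution time of the ISR */
    stat_account_t account;
} shared_handler_t;

static void enhanced_model_irq(shared_handler_t *chain, size_t n,
                               stat_account_t *preempted,
                               uint64_t resched_ns)
{
    stat_account_t *current = preempted; /* who pays for pending time */
    uint64_t pending_ns = 0;             /* time not yet charged */
    size_t i;

    assert(chain != NULL || n == 0);
    assert(preempted != NULL);

    for (i = 0; i < n; i++) {
        pending_ns += chain[i].cost_ns;  /* the ISR runs */
        if (chain[i].fired) {
            /* Account switch happens only for handled IRQs. */
            chain[i].account.hits++;
            current = &chain[i].account;
            current->total_ns += pending_ns;
            pending_ns = 0;
        }
    }

    /* The scheduler path is charged to whoever is current now. */
    pending_ns += resched_ns;
    current->total_ns += pending_ns;
}
```

With two handlers on the line, where the first one (cost 2) did not
fire and the second one (cost 10) did, and a rescheduling cost of 5,
the second handler is charged 17 and the preempted thread nothing. If
neither fired, the whole 17 stays with the preempted thread and no
account switch takes place.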

Here are some numbers I collected today on a P-III 700 MHz, running our
3D laser range scanner + some post-processing steps (sensor fusion and
2D mapping) and forwarding those data via TCP/IP. That box makes use of
the xeno_16550A driver, collecting serial streams at 115k2 and 500k over
a shared IRQ (you may guess which one belongs to what IRQ :)).

ISR accounting (patch 2/3):
> CPU  PID  MSW  CSW     PF  STAT      %CPU  NAME
>   0  0    0    11160   0   01400080  73.0  ROOT
>   0  0    0    853     0   00000082   0.0  timsPipeReceiver
>   0  876  0    61      0   00d00082   0.0  LadarSickLms2002C
>   0  878  0            0   00d00086   0.4  LadarSickLms2002D
>   0  879  0    59      0   00d00082   0.0  Scan2dVirtual3C
>   0  880  0    1207    0   00d00084   0.0  Scan2dVirtual3D
>   0  881  0    167     0   00d00082   0.0  Scan3DScanDriveSick0C
>   0  882  0    2935    0   00d00086  10.4  Scan3DScanDriveSick0D
>   0  887  0    140     0   00d00082   0.0  ServoDriveScanDrive0C
>   0  888  0    2509    0   00d00086   0.1  ServoDriveScanDrive0D
>   0  0    0    41913   0              0.1  IRQ0: [timer]
>   0  0    0    176870  0             13.5  IRQ5: rtser5
>   0  0    0    11121   0              2.4  IRQ5: rtser4
Enhanced accounting (patch 3/3):
> CPU  PID  MSW  CSW     PF  STAT      %CPU  NAME
>   0  0    0    11384   0   01400080  72.4  ROOT
>   0  0    0    745     0   00000082   0.0  timsPipeReceiver
>   0  884  0    56      0   00d00082   0.0  LadarSickLms2002C
>   0  887  0    7282    0   00d00082   0.4  LadarSickLms2002D
>   0  885  0    140     0   00d00082   0.0  Scan3DScanDriveSick0C
>   0  889  0    3066    0   00d00086  10.4  Scan3DScanDriveSick0D
>   0  888  0    54      0   00d00082   0.0  Scan2dVirtual3C
>   0  890  0    1069    0   00d00084   0.0  Scan2dVirtual3D
>   0  897  0    76      0   00d00082   0.0  ServoDriveScanDrive0C
>   0  898  0    2271    0   00d00086   0.1  ServoDriveScanDrive0D
>   0  0    0    50700   0              0.1  IRQ0: [timer]
>   0  0    0    201357  0             14.1  IRQ5: rtser5
>   0  0    0    5811    0              2.4  IRQ5: rtser4

So the difference between both accounting models is not dramatic, but
noticeable. Clearly, the variation increases with every high rate IRQ
source you add (rtser5 was about 7 KHz) and with every step down the CPU
performance ladder.
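
As a rough back-of-the-envelope check, the %CPU of an IRQ account can
be turned into an average cost per interrupt. The helper below is only
an illustration (the function name is made up), and it assumes that
the rate is exactly 7 kHz and that the two runs above are comparable,
which they are only approximately:

```c
#include <assert.h>

/* Average CPU time per interrupt in microseconds, given the %CPU
 * charged to an IRQ account and the interrupt rate in Hz. */
static double cost_per_irq_us(double cpu_percent, double rate_hz)
{
    assert(rate_hz > 0.0);
    return cpu_percent / 100.0 / rate_hz * 1e6;
}
```

With the rtser5 figures, cost_per_irq_us(13.5, 7000.0) gives roughly
19.3 us per interrupt for patch 2/3 and cost_per_irq_us(14.1, 7000.0)
roughly 20.1 us for patch 3/3, i.e. a bit under 1 us more per
interrupt on this box.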

Ok, enough talk, we are looking forward to feedback now! What
specifically needs testing is SMP, I only once ran the patches under a
2-way qemu.

Jan


