Re: [Xenomai-core] Testing the adeos-ipipe-2.6.13-ppc-1.0-00.patch

2005-10-19 Thread Wolfgang Grandegger
On 10/17/2005 05:42 PM Fillod Stephane wrote:
 Hi Philippe,
 
 Sorry for the late report, Xenomai appears to work fine on a Freescale e500
 board (MPC8541E) under Linux 2.6.13. Xenomai version was v1.9.9, i.e. the
 daily snapshot as of today. Here are some preliminary figures (CPU 800MHz,
 Bus 133MHz, 32 KiB I-Cache, 32 KiB D-Cache, 256 KiB L2):
 
 switch $ ./run
 == Sampling period: 100 us
 RTH| lat min| lat avg| lat max|lost
 RTD|    3660|    3690|    8070|   0
 
 klatency $ ./run
 RTH|klat min|klat avg|klat max| overrun|
 RTS|   -7350|   -5715|    6420|       0|00:03:17/00:03:17
 
 latency $ ./run
 == Sampling period: 100 us
 RTT|  00:08:04
 RTH|-lat min|-lat avg|-lat max|-overrun|
 RTS|   -6930|   -4260|    8700|       0|00:08:06/00:08:06
 
 Load for klatency/latency was ping flooding on FCC (piece of cake),
 and cache calibrator. IMHO, we can do nastier.

You mean the cache calibrator from http://monetdb.cwi.nl/Calibrator/? I
tried it on my Ocotea board and it increased the max latency for 25 to
30 us.

Thanks.

Wolfgang.




RE: [Xenomai-core] Testing the adeos-ipipe-2.6.13-ppc-1.0-00.patch

2005-10-19 Thread Fillod Stephane
Wolfgang Grandegger wrote:
[...]
 Load for klatency/latency was ping flooding on FCC (piece of cake),
 and cache calibrator. IMHO, we can do nastier.

You mean the cache calibrator from http://monetdb.cwi.nl/Calibrator/? I
tried it on my Ocotea board and it increased the max latency for 25 to
30 us.

Yes, that very one. In this case, it has been used as a cache-thrashing
load generator. But IMHO, this Calibrator would be better used in the
Benchmarking Plan to get L1/L2/RAM access latency figures (w/o RT running),
and to offer one more correlation against RT latency results.

We can afford a better cache-thrashing load generator. Earlier this year,
I proposed flushy(tm) [1], but as Philippe suggested, we can do better.
Flushy should be rewritten as an ADEOS layer, inserted just in front of
Xenomai in the pipeline. This way, we would be sure the caches
are dead cold when Xenomai enters its domain. Using tools like OProfile,
it should then be possible to track cache misses, and fix them
by prefetching, where available.

[1] http://rtai.dk/cgi-bin/gratiswiki.pl?Latency_Killer (bottom of page)


Here is the result of my 1.0-01 tests on e500:

$ cat /proc/ipipe/version
1.0-01

SWITCH without load:
RTH| lat min| lat avg| lat max|lost
RTD|    3660|    3690|    8070|   0 1.0-00
RTD|    4620|    4740|    8730|   0 1.0-01

KLATENCY with load:
RTH|-lat min|-lat avg|-lat max|-overrun|
RTS|   -7350|   -5715|    6420|       0|00:03:17 1.0-00
RTS|   -6150|   -4384|   12180|       0|00:03:13 1.0-01

LATENCY with load:
== Sampling period: 100 us
RTH|-lat min|-lat avg|-lat max|-overrun|
RTS|   -6930|   -4260|    8700|       0|00:08:06 1.0-00
RTS|   -5670|   -4620|   12930|       0|00:12:39 1.0-01

That's weird. Figures are worse, but since the load (ping -f + calibrator)
was executed manually, it may not be the same.
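
To put a number on "worse", the LATENCY rows above can be compared column by column. This is only a minimal sketch: it assumes the figures are in nanoseconds and that each row has the layout `RTS|min|avg|max|overrun|elapsed version` as pasted in this message. The helper names `parse_row` and `delta_us` are hypothetical and not part of the Xenomai test suite.

```python
# Hypothetical helpers for comparing result rows as pasted in this thread.

def parse_row(row):
    """Split 'RTS|min|avg|max|overrun|elapsed version' into a dict."""
    fields = [f.strip() for f in row.split("|")]
    elapsed, version = fields[5].split()
    return {
        "min": int(fields[1]),
        "avg": int(fields[2]),
        "max": int(fields[3]),
        "overrun": int(fields[4]),
        "elapsed": elapsed,
        "version": version,
    }


def delta_us(old, new, key):
    """Return new - old for one column, converted from ns (assumed) to us."""
    return (new[key] - old[key]) / 1000.0


# LATENCY-with-load rows for I-pipe 1.0-00 and 1.0-01, copied from above.
ROWS = [
    "RTS|   -6930|   -4260|8700|   0|00:08:06 1.0-00",
    "RTS|   -5670|   -4620|   12930|   0|00:12:39 1.0-01",
]

old, new = (parse_row(r) for r in ROWS)
for key in ("min", "avg", "max"):
    print(f"{key}: {delta_us(old, new, key):+.2f} us "
          f"({old['version']} -> {new['version']})")
```

Note that the two runs did not last the same time (00:08:06 vs 00:12:39), so the max column in particular is not a like-for-like comparison.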

-- 
Stephane




Re: [Xenomai-core] Testing the adeos-ipipe-2.6.13-ppc-1.0-00.patch

2005-10-19 Thread Philippe Gerum

Fillod Stephane wrote:

Wolfgang Grandegger wrote:
[...]


Load for klatency/latency was ping flooding on FCC (piece of cake),
and cache calibrator. IMHO, we can do nastier.


You mean the cache calibrator from http://monetdb.cwi.nl/Calibrator/? I
tried it on my Ocotea board and it increased the max latency for 25 to
30 us.



Yes, that very one. In this case, it has been used as a cache-thrashing
load generator. But IMHO, this Calibrator would be better used in the
Benchmarking Plan to get L1/L2/RAM access latency figures (w/o RT running),
and to offer one more correlation against RT latency results.

We can afford a better cache-thrashing load generator. Earlier this year,
I proposed flushy(tm) [1], but as Philippe suggested, we can do better.
Flushy should be rewritten as an ADEOS layer, inserted just in front of
Xenomai in the pipeline. This way, we would be sure the caches
are dead cold when Xenomai enters its domain. Using tools like OProfile,
it should then be possible to track cache misses, and fix them
by prefetching, where available.


[1] http://rtai.dk/cgi-bin/gratiswiki.pl?Latency_Killer (bottom of page)


Here is the result of my 1.0-01 tests on e500:

$ cat /proc/ipipe/version
1.0-01

SWITCH without load:
RTH| lat min| lat avg| lat max|lost
RTD|    3660|    3690|    8070|   0 1.0-00
RTD|    4620|    4740|    8730|   0 1.0-01

KLATENCY with load:
RTH|-lat min|-lat avg|-lat max|-overrun|
RTS|   -7350|   -5715|    6420|       0|00:03:17 1.0-00
RTS|   -6150|   -4384|   12180|       0|00:03:13 1.0-01

LATENCY with load:
== Sampling period: 100 us
RTH|-lat min|-lat avg|-lat max|-overrun|
RTS|   -6930|   -4260|    8700|       0|00:08:06 1.0-00
RTS|   -5670|   -4620|   12930|       0|00:12:39 1.0-01

That's weird. Figures are worse, but since the load (ping -f + calibrator)
was executed manually, it may not be the same.



Ok, I now suspect that another change regarding the size of the interrupt 
counters made this worse. I'm going to revert it and upload -02, just to make sure.


--

Philippe.



Re: [Xenomai-core] Testing the adeos-ipipe-2.6.13-ppc-1.0-00.patch

2005-10-19 Thread Philippe Gerum

Fillod Stephane wrote:

Philippe Gerum wrote:
[..]

http://download.gna.org/adeos/patches/v2.6/adeos/ppc/adeos-ipipe-2.6.13-ppc-1.0-02.patch

Here is the result of tests with version 1.0-02 on e500:

load: ~1 minute ping -f, one run of calibrator chewing 64MiB.

$ cat /proc/ipipe/version
1.0-02

SWITCH without load:
RTH| lat min| lat avg| lat max|lost
RTD|    3660|    3690|    8070|   0 1.0-00
RTD|    4620|    4740|    8730|   0 1.0-01
RTD|    4620|    4740|    8190|   0 1.0-02

KLATENCY with load:
RTH|-lat min|-lat avg|-lat max|-overrun|
RTS|   -7350|   -5715|    6420|       0|00:03:17 1.0-00
RTS|   -6150|   -4384|   12180|       0|00:03:13 1.0-01
RTS|   -6150|   -4183|   12480|       0|00:03:38 1.0-02

LATENCY with load:
== Sampling period: 100 us
RTH|-lat min|-lat avg|-lat max|-overrun|
RTS|   -6930|   -4260|    8700|       0|00:08:06 1.0-00
RTS|   -5670|   -4620|   12930|       0|00:12:39 1.0-01
RTS|   -5700|   -3750|   11280|       0|00:06:05 1.0-02

It looks like the char vs. long in the 1.0-0[12] patch was not the culprit,


The last significant change between -00 and -01 is actually the one related to 
the fork pressure (others are cosmetic ones aimed at better sharing stuff with 
the blackfin port). The patch below against -02 removes it.


--- 2.6.13/arch/ppc/kernel/entry.S~ 2005-10-18 18:42:09.0 +0200
+++ 2.6.13/arch/ppc/kernel/entry.S  2005-10-19 15:07:54.0 +0200
@@ -316,10 +316,8 @@

     .globl  ret_from_fork
 ret_from_fork:
-    STALL_ROOT_COND
     REST_NVGPRS(r1)
     bl  schedule_tail
-    UNSTALL_ROOT_COND
     li  r3,0
     b   ret_from_syscall


at least not on e500. I'll do the bench again on 1.0-00. Man, if only we
had that automated benchmark suite...



Indeed... The positive thing being that we now have the ultimate proof of its
usefulness :o


--

Philippe.



Re: [Xenomai-core] Testing the adeos-ipipe-2.6.13-ppc-1.0-00.patch

2005-10-17 Thread Wolfgang Grandegger
On 10/15/2005 09:17 PM Heikki Lindholm wrote:
 Wolfgang Grandegger wrote:
 Hello Philippe,
 
 I got Xenomai working on an Ocotea-Board (AMCC 440GX) and a low-end
 TQM855L-Module (MPC 855) under Linux 2.6.14-rc3 :-). The patch applied
 with a few hunks and one easy-to-fix reject, and I had to correct two
 problems. One with FEW_CONTEXT (see attached patch) and the second with
 #include <asm/offsets.h> in xenomai/arch/ppc/hal/switch.S. The
 include file does not exist (any more) in the kernel tree and therefore
 I commented out the line. I'm going to perform latency tests on various
 4xx and 8xx boards next week. Here are some preliminary figures of the
 TQM855L-Module (CPU 80 MHz, Bus 40 MHz, 4 kB I-Cache 4 kB D-Cache):
 
 If you happen to know some (semi-)comparable figures for the same boards 
 using some commercial RTOS, it would be nice to know them also, for 
 comparison.

Well, we only deal with free software. But I can compare the result
from the klatency test with the one from RTAI/RTHAL under Linux 2.4, of
course.

Wolfgang.



RE: [Xenomai-core] Testing the adeos-ipipe-2.6.13-ppc-1.0-00.patch

2005-10-17 Thread Fillod Stephane
Hi Philippe,

Sorry for the late report, Xenomai appears to work fine on a Freescale e500
board (MPC8541E) under Linux 2.6.13. Xenomai version was v1.9.9, i.e. the
daily snapshot as of today. Here are some preliminary figures (CPU 800MHz,
Bus 133MHz, 32 KiB I-Cache, 32 KiB D-Cache, 256 KiB L2):

switch $ ./run
== Sampling period: 100 us
RTH| lat min| lat avg| lat max|lost
RTD|    3660|    3690|    8070|   0

klatency $ ./run
RTH|klat min|klat avg|klat max| overrun|
RTS|   -7350|   -5715|    6420|       0|00:03:17/00:03:17

latency $ ./run
== Sampling period: 100 us
RTT|  00:08:04
RTH|-lat min|-lat avg|-lat max|-overrun|
RTS|   -6930|   -4260|    8700|       0|00:08:06/00:08:06

Load for klatency/latency was ping flooding on FCC (piece of cake),
and cache calibrator. IMHO, we can do nastier.


Thanks!

-- 
Stephane

PS: some rtai skin patches are to be expected




Re: [Xenomai-core] Testing the adeos-ipipe-2.6.13-ppc-1.0-00.patch

2005-10-17 Thread Philippe Gerum

Fillod Stephane wrote:

Hi Philippe,

Sorry for the late report, Xenomai appears to work fine on a Freescale e500
board (MPC8541E) under Linux 2.6.13. Xenomai version was v1.9.9, i.e. the
daily snapshot as of today. Here are some preliminary figures (CPU 800MHz,
Bus 133MHz, 32 KiB I-Cache, 32 KiB D-Cache, 256 KiB L2):


switch $ ./run
== Sampling period: 100 us
RTH| lat min| lat avg| lat max|lost
RTD|    3660|    3690|    8070|   0

klatency $ ./run
RTH|klat min|klat avg|klat max| overrun|
RTS|   -7350|   -5715|    6420|       0|00:03:17/00:03:17

latency $ ./run
== Sampling period: 100 us
RTT|  00:08:04
RTH|-lat min|-lat avg|-lat max|-overrun|
RTS|   -6930|   -4260|    8700|       0|00:08:06/00:08:06



Great you tested that, thanks. The calibration looks a bit pessimistic, so I 
guess that a narrowed one would leave us with something in the 10-12 us range 
worst-case in user-space, which would still be quite decent.



Load for klatency/latency was ping flooding on FCC (piece of cake),
and cache calibrator. IMHO, we can do nastier.




Mixed LTP stuff and dd loops are quite good punishers AFAICS here.

--

Philippe.



Re: [Xenomai-core] Testing the adeos-ipipe-2.6.13-ppc-1.0-00.patch

2005-10-15 Thread Heikki Lindholm

Wolfgang Grandegger wrote:

Hello Philippe,

I got Xenomai working on an Ocotea-Board (AMCC 440GX) and a low-end
TQM855L-Module (MPC 855) under Linux 2.6.14-rc3 :-). The patch applied
with a few hunks and one easy-to-fix reject, and I had to correct two
problems. One with FEW_CONTEXT (see attached patch) and the second with
#include <asm/offsets.h> in xenomai/arch/ppc/hal/switch.S. The
include file does not exist (any more) in the kernel tree and therefore
I commented out the line. I'm going to perform latency tests on various
4xx and 8xx boards next week. Here are some preliminary figures of the
TQM855L-Module (CPU 80 MHz, Bus 40 MHz, 4 kB I-Cache 4 kB D-Cache):


If you happen to know some (semi-)comparable figures for the same boards 
using some commercial RTOS, it would be nice to know them also, for 
comparison.


-- Heikki Lindholm