Abramo Bagnara wrote:
> 
> However, if I am reading the web page correctly:
> 
> "The benchmark times how long it takes to go from one process to another
> when the context switch is done by reading and writing data."
> 
> Then I suppose this is not the pure context switch time: it also
> includes the read/write syscall and copy time, which seems to be
> significant. Note also that in the current kernel code a pipe needs a
> double copy (Manfred Spraul has been improving that in the past few days).
> 
> If (I underline "if") the cost is ~2 microseconds this means 0.1538% of
> CPU use per context switch.
> 
> Do you think I'm missing something?
> 
> I'll try to run some tests here (on a K6-II @ 333 MHz: a small machine)
> using sched_yield and rdtsc. I can probably get more data that way.

I've attached my test program (don't try it on SMP).
The results are very interesting:

$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 5
model           : 8
model name      : AMD-K6(tm) 3D processor
stepping        : 0
cpu MHz         : 333.113
cache size      : 64 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 mmx syscall 3dnow
bogomips        : 663.55
$ cat /proc/version 
Linux version 2.4.4 (root@igor) (gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)) #5 SMP sab mag 5 19:00:39 CEST 2001
$ ./ctx 2
0: 499998 iterations
1: 500002 iterations
min=893 max=397782 avg=918
$ ./ctx 2
0: 499992 iterations
1: 500008 iterations
min=886 max=85798 avg=917
$ ./ctx 2
0: 499997 iterations
1: 500003 iterations
min=893 max=85635 avg=918
$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'. 
1000000000*918/333113000
2755

Average time for a context switch (avg=918 cycles at 333.113 MHz):
2755 ns = 2.755 us
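
The same conversion can be written in C. This is only a minimal sketch: cycles_to_ns is a hypothetical helper name, and hz is assumed to be the "cpu MHz" value from /proc/cpuinfo multiplied by 10^6.

```c
#include <stdint.h>

/* Convert a TSC cycle count to nanoseconds, truncating as bc does.
   cycles_to_ns is a hypothetical name; hz is the CPU clock in Hz. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t hz)
{
        /* Multiply before dividing so the integer division keeps as much
           precision as possible. There is no overflow check, which is
           fine for small counts such as the averages reported here. */
        return cycles * UINT64_C(1000000000) / hz;
}
```

For example, cycles_to_ns(918, 333113000) reproduces the bc result above.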

Another machine:
$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 6
cpu MHz         : 797.995
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1592.52
$ cat /proc/version 
Linux version 2.4.4-xfs (root@linux1) (gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-84)) #12 mer mag 9 11:47:14 CEST 2001
$ ./ctx 2
0: 499999 iterations
1: 500001 iterations
min=556 max=57779 avg=570
$ ./ctx 2
0: 499999 iterations
1: 500001 iterations
min=556 max=42148 avg=570
$ ./ctx 2
0: 500001 iterations
1: 499999 iterations
min=559 max=183607 avg=571
$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'. 
1000000000*571/797995000
715

Average time for a context switch (avg=571 cycles at 797.995 MHz):
715 ns = 0.715 us

I don't know whether using fork instead of threads would change these
numbers significantly.

-- 
Abramo Bagnara                       mailto:[EMAIL PROTECTED]

Opera Unica                          Phone: +39.546.656023
Via Emilia Interna, 140
48014 Castel Bolognese (RA) - Italy

ALSA project               http://www.alsa-project.org
It sounds good!

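#include <limits.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <asm/msr.h>            /* rdtscll() */

#define MAX_THREADS 100
#define ITERATIONS  1000000

/* Shared by all threads with no locking: uniprocessor only
   (don't try it on SMP). */
unsigned long long old;         /* TSC value read by the last thread that ran */
unsigned long min_diff = ULONG_MAX, max_diff = 0;
unsigned long long sum_diff = 0;
volatile unsigned int count = 0;
volatile unsigned int start = 0;

unsigned int distr[MAX_THREADS];        /* iterations done by each thread */

void *thread(void *data)
{
        unsigned long long prev, diff;
        unsigned int k = (unsigned int) (unsigned long) data;

        while (!start)
                ;               /* spin until main() has created all threads */
        while (count < ITERATIONS) {
                /* cycles since the previous rdtscll(), normally done by
                   the thread that ran just before this one */
                prev = old;
                rdtscll(old);
                diff = old - prev;
                if (diff < min_diff) {
                        min_diff = (unsigned long) diff;
                }
                if (diff > max_diff) {
                        max_diff = (unsigned long) diff;
                }
                sum_diff += diff;
                count++;
                distr[k]++;
                sched_yield();  /* give the CPU to the next runnable thread */
        }
        return 0;
}

int main(int argc, char **argv)
{
        int num_threads = argc > 1 ? atoi(argv[1]) : 0;
        pthread_t threads[MAX_THREADS];
        int k;

        if (num_threads < 1 || num_threads > MAX_THREADS) {
                fprintf(stderr, "usage: %s <threads (1-%d)>\n",
                        argv[0], MAX_THREADS);
                return 1;
        }
        for (k = 0; k < num_threads; ++k) {
                if (pthread_create(&threads[k], NULL, thread,
                                   (void *) (long) k) != 0) {
                        fprintf(stderr, "pthread_create failed\n");
                        return 1;
                }
        }
        rdtscll(old);           /* reference timestamp for the first diff */
        start = 1;              /* release the spinning threads */
        for (k = 0; k < num_threads; ++k) {
                pthread_join(threads[k], 0);
                printf("%d: %u iterations\n", k, distr[k]);
        }
        printf("min=%lu max=%lu avg=%llu\n", min_diff, max_diff, sum_diff / count);
        return 0;
}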
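
If <asm/msr.h> on your system does not provide rdtscll() to user programs, something like the following could stand in for that include. This is a sketch assuming GCC or Clang: read_tsc is a hypothetical helper name, and the clock_gettime() fallback for non-x86 machines is my own assumption and returns nanoseconds, not cycles.

```c
#include <stdint.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
/* Read the CPU time stamp counter with the RDTSC instruction. */
static inline uint64_t read_tsc(void)
{
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
        return ((uint64_t) hi << 32) | lo;
}
#else
/* Assumed fallback for other CPUs: monotonic clock in ns, NOT cycles. */
static inline uint64_t read_tsc(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t) ts.tv_sec * 1000000000u + (uint64_t) ts.tv_nsec;
}
#endif

/* Same calling convention as the macro used by the test program. */
#define rdtscll(val) ((val) = read_tsc())
```

With the fallback in use, the min/max/avg figures are already in nanoseconds and must not be divided by the clock rate again.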
