Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-28 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> Andrew,
> 
> could we try this one in -mm? It unifies (and simplifies) the TSC sync 
> code between i386 and x86_64, and also offers a stronger guarantee 
> that we'll only activate the TSC clock on CPU where the TSC is synced 
> correctly by the hardware.

updated patch below. (Mike Galbraith reported that suspend broke on -rt 
kernels, it was due to an __init/__cpuinit mismatch)

Ingo

->
Subject: x86: rewrite SMP TSC sync code
From: Ingo Molnar <[EMAIL PROTECTED]>

make the TSC synchronization code more robust, and unify
it between x86_64 and i386.

The biggest change is the removal of the 'fix up TSCs' code
on x86_64 and i386, in some rare cases it was /causing/
time-warps on SMP systems.

The new code only checks for TSC asynchronity - and if it can
prove a time-warp (if it can observe the TSC going backwards
when going from one CPU to another within a critical section),
then the TSC clock-source is turned off.
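
The check described above can be pictured with a small userspace sketch. This is not the code from the patch: the names warp_check, warp_observe and warp_detected are hypothetical, and the caller is assumed to hold a lock so that counter reads from different CPUs are recorded one at a time.

```c
#include <stdint.h>

/* Hypothetical state shared by all CPUs; the caller is assumed to hold a lock. */
struct warp_check {
    uint64_t last;      /* most recent counter value seen on any CPU */
    uint64_t max_warp;  /* largest backwards jump observed so far */
    unsigned nr_warps;  /* number of backwards jumps observed so far */
};

/* Record one counter read taken inside the critical section. */
void warp_observe(struct warp_check *wc, uint64_t now)
{
    if (wc->last > now) {
        /* The previous read, possibly on another CPU, was later than this one. */
        uint64_t warp = wc->last - now;

        if (warp > wc->max_warp)
            wc->max_warp = warp;
        wc->nr_warps++;
    }
    wc->last = now;
}

/* Non-zero means a time-warp was proven and the counter should not be trusted. */
int warp_detected(const struct warp_check *wc)
{
    return wc->nr_warps != 0;
}
```

A sequence of reads such as 100, 110, 105, 120 records one warp of 5 ticks. The sketch only ever proves that a warp happened; a clean run does not prove that the counters are in sync.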

The TSC synchronization-checking code also got moved into a
separate file.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/i386/kernel/Makefile |2 
 arch/i386/kernel/smpboot.c|  178 ++--
 arch/i386/kernel/tsc.c|4 
 arch/i386/kernel/tsc_sync.c   |1 
 arch/x86_64/kernel/Makefile   |2 
 arch/x86_64/kernel/smpboot.c  |  230 ++
 arch/x86_64/kernel/time.c |   11 ++
 arch/x86_64/kernel/tsc_sync.c |  187 ++
 include/asm-i386/tsc.h|   49 
 include/asm-x86_64/proto.h|2 
 include/asm-x86_64/timex.h|   26 
 include/asm-x86_64/tsc.h  |   66 
 12 files changed, 295 insertions(+), 463 deletions(-)

Index: linux/arch/i386/kernel/Makefile
===
--- linux.orig/arch/i386/kernel/Makefile
+++ linux/arch/i386/kernel/Makefile
@@ -18,7 +18,7 @@ obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_X86_CPUID)+= cpuid.o
 obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_APM)  += apm.o
-obj-$(CONFIG_X86_SMP)  += smp.o smpboot.o
+obj-$(CONFIG_X86_SMP)  += smp.o smpboot.o tsc_sync.o
 obj-$(CONFIG_X86_TRAMPOLINE)   += trampoline.o
 obj-$(CONFIG_X86_MPPARSE)  += mpparse.o
 obj-$(CONFIG_X86_LOCAL_APIC)   += apic.o nmi.o
Index: linux/arch/i386/kernel/smpboot.c
===
--- linux.orig/arch/i386/kernel/smpboot.c
+++ linux/arch/i386/kernel/smpboot.c
@@ -88,12 +88,6 @@ cpumask_t cpu_possible_map;
 EXPORT_SYMBOL(cpu_possible_map);
 static cpumask_t smp_commenced_mask;
 
-/* TSC's upper 32 bits can't be written in eariler CPU (before prescott), there
- * is no way to resync one AP against BP. TBD: for prescott and above, we
- * should use IA64's algorithm
- */
-static int __devinitdata tsc_sync_disabled;
-
 /* Per CPU bogomips and other parameters */
 struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;
 EXPORT_SYMBOL(cpu_data);
@@ -210,151 +204,6 @@ valid_k7:
;
 }
 
-/*
- * TSC synchronization.
- *
- * We first check whether all CPUs have their TSC's synchronized,
- * then we print a warning if not, and always resync.
- */
-
-static struct {
-   atomic_t start_flag;
-   atomic_t count_start;
-   atomic_t count_stop;
-   unsigned long long values[NR_CPUS];
-} tsc __initdata = {
-   .start_flag = ATOMIC_INIT(0),
-   .count_start = ATOMIC_INIT(0),
-   .count_stop = ATOMIC_INIT(0),
-};
-
-#define NR_LOOPS 5
-
-static void __init synchronize_tsc_bp(void)
-{
-   int i;
-   unsigned long long t0;
-   unsigned long long sum, avg;
-   long long delta;
-   unsigned int one_usec;
-   int buggy = 0;
-
-   printk(KERN_INFO "checking TSC synchronization across %u CPUs: ", num_booting_cpus());
-
-   /* convert from kcyc/sec to cyc/usec */
-   one_usec = cpu_khz / 1000;
-
-   atomic_set(&tsc.start_flag, 1);
-   wmb();
-
-   /*
-* We loop a few times to get a primed instruction cache,
-* then the last pass is more or less synchronized and
-* the BP and APs set their cycle counters to zero all at
-* once. This reduces the chance of having random offsets
-* between the processors, and guarantees that the maximum
-* delay between the cycle counters is never bigger than
-* the latency of information-passing (cachelines) between
-* two CPUs.
-*/
-   for (i = 0; i < NR_LOOPS; i++) {
-   /*
-* all APs synchronize but they loop on '== num_cpus'
-*/
-   while (atomic_read(&tsc.count_start) != num_booting_cpus()-1)
-   cpu_relax();
-   atomic_set(&tsc.count_stop, 0);
-   wmb();
-   /*
-* this lets the APs save their current TSC:
-*/
-

Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-27 Thread Christoph Lameter
On Fri, 24 Nov 2006, Andi Kleen wrote:

> The trouble is that people are using the RDTSC anyways even if the
> kernel doesn't. So some synchronization is probably a good idea.

It is better to simply leave TSC alone if unsynchronized. If TSCs appear 
to be in sync (through some sporadic "synchronization") then people will 
be tempted to use RDTSC because it seems to work. However, RDTSC will
sporadically yield incoherent values (i.e. time earlier than last TSC 
read) if we have no full synchronization.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-27 Thread Max Krasnyansky



Robert Crocombe wrote:

> Using gtod() can amount to a substantial disturbance of the thing to
> be measured.  Using rdtsc, things seem reliable so far, and we have an
> FPGA (accessed through the PCI bus) that has been programmed to give
> access to an 8MHz clock and we do some checks against that.


Same here. gettimeofday() is way too slow (dual Opteron box) for the
frequency I need to call it at. HPET is not available. But TSC is doing just
fine. Plus in my case I don't care about sync between CPUs (thread that uses
TSC is running on the isolated CPU) and I have external time source that takes
care of the drift.

So please no trapping of the RDTSC. Making it clear (bold kernel message during
boot :-) that TSC(s) are not in sync or unstable (from GTOD point of view) is of
course perfectly fine.

Thanx
Max


Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-27 Thread Robert Crocombe

The difference that Wink reports is tiny compared to that measured on
my Opteron machines:

dual (2.6.17):

[EMAIL PROTECTED]:cyclecounter_test$ ./rdtsc-pref 100
rdtsc:   average ticks=  10
gtod:    average ticks=4296
gtod_us: average ticks=4328

quad (2.6.16-rt29):

[EMAIL PROTECTED]:wink_saville_test$ ./rdtsc-pref 100
rdtsc:   average ticks=  10
gtod:    average ticks=5688
gtod_us: average ticks=5711

I have my own little test that I'll attach, but it gives a similar
result.  Here are the results from the 2x box:

[EMAIL PROTECTED]:cyclecounter_test$ ./timing
Using the cycle counter
Calibrated timer as 2593081969.758825 Hz
4194304 iterations in 0.016 seconds is 0.004 useconds per iteration.

[EMAIL PROTECTED]:cyclecounter_test$ ./timing_gettimeofday
Using gettimeofday
4194304 iterations in 6.793 seconds is 1.620 useconds per iteration.
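
As a cross-check, the tick counts from rdtsc-pref and the per-iteration time from the timing test can be compared by converting ticks to microseconds with the calibrated frequency. A minimal sketch; ticks_to_usec is a hypothetical helper, and it assumes the counter ran at the calibrated rate for the whole measurement.

```c
#include <stdint.h>

/* Convert a tick count to microseconds, given the counter frequency in Hz. */
double ticks_to_usec(uint64_t ticks, double hz)
{
    return (double)ticks / hz * 1e6;
}
```

With the dual-Opteron figures above, ticks_to_usec(4296, 2593081969.758825) comes to roughly 1.66 useconds, in line with the 1.620 useconds per iteration reported for gettimeofday.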

I have used the pthread affinity and/or cpuset, etc. mechanisms to try
and inject some reliability into the measurement.

Using gtod() can amount to a substantial disturbance of the thing to
be measured.  Using rdtsc, things seem reliable so far, and we have an
FPGA (accessed through the PCI bus) that has been programmed to give
access to an 8MHz clock and we do some checks against that.

--
Robert Crocombe
[EMAIL PROTECTED]
#include <stdio.h>            // printf()
#include <stdint.h>           // uint64_t
#include <stdlib.h>           // drand48()
#include <sys/select.h>       // select()
#include <sys/time.h>         // gettimeofday
#include <time.h>
#include <asm-x86_64/msr.h>   // rdtscll()



// Globals


enum
{
ITERATIONS  = 1 << 22
};

static double seconds_per_tick;


// Prototypes


double gimme_timeofday(void);
double get_time(void);

void selectsleep(unsigned us);
void init(void);


// Definitions


double
gimme_timeofday(void)
{
    struct timeval tv;
    gettimeofday(&tv, 0);
    return tv.tv_sec + 1e-6 * tv.tv_usec;
}


double
get_time(void)
{
    uint64_t t;
    rdtscll(t);
    return t * seconds_per_tick;
}


/**
A good way to simply hang around doing nothing for awhile.
*/

void
selectsleep(unsigned us)
{
	struct timeval tv;
	tv.tv_sec = 0;
	tv.tv_usec = us;
	select(0,0,0,0,&tv);
}

/**
Figure out how fast rdtscll() ticks.  This should be equal to the
frequency of the clock on the processor.  Here's the bad news: I don't
know if rdtscll() always uses the same processor so it may very well be
necessary to set a processor affinity to get really good results over
time.

This piece of code by Mark Hahn from brain.mcmaster.ca/~hahn/.
*/

void
init(void)
{
	double sumx = 0;
	double sumy = 0;
	double sumxx = 0;
	double sumxy = 0;
	double slope;

	// least squares linear regression of ticks onto real time
	// as returned by gettimeofday.

	const unsigned n = 30;

	for (unsigned int i = 0; i < n; ++i)
	{
		double breal,real,ticks;
		uint64_t aticks, bticks;
	
		breal = gimme_timeofday();
		rdtscll(bticks);

		selectsleep((unsigned)(1 + drand48() * 20));

		rdtscll(aticks);
		ticks = aticks - bticks;
		real = gimme_timeofday() - breal;

		sumx += real;
		sumxx += real * real;
		sumxy += real * ticks;
		sumy += ticks;
	}

	slope = ((sumxy - (sumx*sumy) / n) / (sumxx - (sumx*sumx) / n));
	seconds_per_tick = 1.0 / slope;

	printf("Calibrated timer as %.6f Hz\n", slope);
}

int
main(int argc, char *argv[])
{
    printf("Doing stuff\n");

#if 0   // Using rdtscll()
    printf("Using the cycle counter\n");
    init();
    double time_start = gimme_timeofday();
    for (unsigned int i = 0; i < ITERATIONS; ++i)
    {
        double the_time;
        the_time = get_time();
    }
    double time_end = gimme_timeofday();
#else   // using gettimeofday()
    printf("Using gettimeofday\n");
    double time_start = gimme_timeofday();
    for (unsigned int i = 0; i < ITERATIONS; ++i)
    {
        double the_time;
        the_time = gimme_timeofday();
    }
    double time_end = gimme_timeofday();
#endif

    double diff = time_end - time_start;

    double useconds = (diff / ITERATIONS) * 1e6;

    printf("%u iterations in %.3f seconds is %.3f useconds per iteration.\n",
           ITERATIONS, diff, useconds);

    printf("Done\n");
    return 0;
}


Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-27 Thread Wink Saville

Arjan van de Ven wrote:

> just to make sure, you do realize that when you write "ticks" that rdtsc
> doesn't measure cpu clock ticks or cpu cycles anymore, right? (At least
> not on your machine)

Yes, that's why I wrote ticks and not cycles. At this point I'm not sure how
to convert ticks to time on my machine, something else to learn :) Hopefully
the HPET will resolve all of the issues.

Wink

Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-26 Thread Arjan van de Ven
On Sun, 2006-11-26 at 11:48 -0800, Wink Saville wrote:
> Arjan van de Ven wrote:
> > it's the cost of a syscall (1000 cycles?) plus what it takes to get a
> > reasonable time estimate. Assuming your kernel has enough time support
> > AND your tsc is reasonably ok, it'll be using that. If it's NOT using
> > that then that's a pretty good sign that you can't also use it in
> > userspace
> > 
> 
> I wrote a quick and dirty program that I've attached to test the cost
> difference between RDTSC and gettimeofday (gtod), the results:
> 
> [EMAIL PROTECTED]:~/linux/linux-2.6/test/rdtsc-pref$ time ./rdtsc-pref 1
> rdtsc:   average ticks=  65
> gtod:    average ticks= 222
> gtod_us: average ticks= 232

just to make sure, you do realize that when you write "ticks" that rdtsc
doesn't measure cpu clock ticks or cpu cycles anymore, right? (At least
not on your machine)


> But, there are other uses that it wouldn't be acceptable. For instance, I
> have used a memory mapped time stamp counter in an embedded ARM based

ARM is a different animal; on such an embedded system you generally know a
lot better whether you have a reliable and userspace-useful tick counter
like this

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-26 Thread Wink Saville

Arjan van de Ven wrote:

> it's the cost of a syscall (1000 cycles?) plus what it takes to get a
> reasonable time estimate. Assuming your kernel has enough time support
> AND your tsc is reasonably ok, it'll be using that. If it's NOT using
> that then that's a pretty good sign that you can't also use it in
> userspace



I wrote a quick and dirty program that I've attached to test the cost
difference between RDTSC and gettimeofday (gtod), the results:

[EMAIL PROTECTED]:~/linux/linux-2.6/test/rdtsc-pref$ time ./rdtsc-pref 1
rdtsc:   average ticks=  65
gtod:    average ticks= 222
gtod_us: average ticks= 232

real    0m36.002s
user    0m35.997s
sys     0m0.000s

About a 3.5x cost difference; still, for most of my uses gtod was not as costly
as I had supposed.

But there are other uses for which it wouldn't be acceptable. For instance, I
have used a memory mapped time stamp counter in an embedded ARM based
system for instrumenting the interrupt service routine, syscalls
and task switches. For this type of instrumentation a gtod type call wouldn't
have been suitable.

Anyway, for x86_64 systems, if I can use a memory mapped HPET counter, I might be
able to have my cake and eat it too. One counter that can be used inside and
outside the kernel that is cheap, precise and accurate, nirvana! We'll have to
see.

BTW my system is a 2.4 GHz Core 2 Duo running 2.6.19-rc6 with HPET enabled;
in the attachment I've included my config file.

Cheers,

Wink



rdtsc-pref.tgz
Description: Binary data


Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-26 Thread Arjan van de Ven
On Sun, 2006-11-26 at 00:16 -0800, Wink Saville wrote:
> Robert Hancock wrote:
> > > Actually, we need to ask the CPU/System makers to provide a system wide
> > Generally user mode code should just be using gettimeofday. When the TSC 
> > is usable as a sane time source, the kernel will use it. When it's not, 
> > it will use something else like the HPET, ACPI PM Timer or (at last 
> > resort) the PIT, in increasing degrees of slowness.
> > 
> 
> But gettimeofday is much too expensive compared to RDTSC.

it's the cost of a syscall (1000 cycles?) plus what it takes to get a
reasonable time estimate. Assuming your kernel has enough time support
AND your tsc is reasonably ok, it'll be using that. If it's NOT using
that then that's a pretty good sign that you can't also use it in
userspace
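
One way to see which time source the kernel settled on is to read it back from sysfs. A minimal sketch, assuming a kernel that exposes the clocksource framework at /sys/devices/system/clocksource/clocksource0/current_clocksource; read_first_line and kernel_uses_tsc are hypothetical helpers, not an existing API.

```c
#include <stdio.h>
#include <string.h>

/* Assumed sysfs location of the active clocksource name. */
#define CLOCKSOURCE_PATH \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/* Read the first line of a file into buf, without the trailing newline.
 * Returns 0 on success, -1 if the file cannot be read. */
int read_first_line(const char *path, char *buf, size_t size)
{
    FILE *f;

    if (size == 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)size, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Returns 1 if the kernel reports "tsc", 0 for another source, -1 if unknown. */
int kernel_uses_tsc(void)
{
    char name[64];

    if (read_first_line(CLOCKSOURCE_PATH, name, sizeof(name)) != 0)
        return -1;
    return strcmp(name, "tsc") == 0;
}
```

A result of 0 would be the "pretty good sign" mentioned above that raw RDTSC values should not be relied on in userspace either.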

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-26 Thread Wink Saville

Robert Hancock wrote:

> > Actually, we need to ask the CPU/System makers to provide a system wide
> Generally user mode code should just be using gettimeofday. When the TSC
> is usable as a sane time source, the kernel will use it. When it's not,
> it will use something else like the HPET, ACPI PM Timer or (at last
> resort) the PIT, in increasing degrees of slowness.

But gettimeofday is much too expensive compared to RDTSC.

Wink

Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-25 Thread Robert Hancock

Wink Saville wrote:
> Arjan van de Ven wrote:
>>> Actually, we need to ask the CPU/System makers to provide a system wide
>>> timer that is independent of the given CPU. I would expect it quite
>>> simple
>>
>> they exist. They're called pmtimer and hpet.
>> pmtimer is port io. hpet is memory mapped io.
>
> Thanks for the info. I took a look at Documentation/hpet.txt and
> drivers/char/hpet.c and see that hpet_mmap is implemented in the driver
> but nothing in hpet.txt indicates what is being mapped.
>
> Could you point me to any other documentation? I did find the following:
>
> http://www.intel.com/hardwaredesign/hpetspec_1.pdf
>
> Are you aware of any example user code that uses the mmap capability of
> hpet?

Generally user mode code should just be using gettimeofday. When the TSC 
is usable as a sane time source, the kernel will use it. When it's not, 
it will use something else like the HPET, ACPI PM Timer or (at last 
resort) the PIT, in increasing degrees of slowness.
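
On the mmap question quoted above, a minimal sketch of what such user code might look like. It assumes a kernel built with HPET mmap support, a readable device node such as /dev/hpet, and the register offsets from the HPET specification (capabilities at 0x000 with the period in femtoseconds in the upper 32 bits, main counter at 0x0f0). The helper names are hypothetical, and none of this is taken from the driver documentation.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define HPET_CAP_OFFSET     0x000  /* General Capabilities and ID register */
#define HPET_COUNTER_OFFSET 0x0f0  /* Main Counter Value register */

/* Map one page of HPET registers read-only; returns NULL on failure. */
volatile uint8_t *hpet_map(const char *dev)
{
    long page = sysconf(_SC_PAGESIZE);
    int fd = open(dev, O_RDONLY);
    void *p;

    if (fd < 0)
        return NULL;
    if (page <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* an established mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : (volatile uint8_t *)p;
}

/* Counter period in femtoseconds: upper 32 bits of the capabilities register. */
uint32_t hpet_period_fs(volatile const uint8_t *regs)
{
    return (uint32_t)(*(volatile const uint64_t *)(regs + HPET_CAP_OFFSET) >> 32);
}

/* Current value of the main counter. */
uint64_t hpet_read_counter(volatile const uint8_t *regs)
{
    return *(volatile const uint64_t *)(regs + HPET_COUNTER_OFFSET);
}

/* Convert ticks to nanoseconds; the split avoids overflow in ticks * period. */
uint64_t hpet_ticks_to_ns(uint64_t ticks, uint32_t period_fs)
{
    const uint64_t fs_per_ns = 1000000;

    return (ticks / fs_per_ns) * period_fs
         + (ticks % fs_per_ns) * period_fs / fs_per_ns;
}
```

If hpet_map("/dev/hpet") returns NULL, the device is missing, busy, or was built without mmap support, and gettimeofday remains the fallback.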


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
