[XenPPC] [xenppc-unstable] [XEN][POWERPC] Normalize timebase_freq to a 64bit value
# HG changeset patch
# User Jimi Xenidis [EMAIL PROTECTED]
# Node ID 20bd3b7b7519e01f7b6bfa97c7a655e1dc027f5d
# Parent  887e1cbac6154da0a3a3c2433fbc5b0fc2a1c9b8
[XEN][POWERPC] Normalize timebase_freq to a 64bit value

Signed-off-by: Jimi Xenidis [EMAIL PROTECTED]
---
 xen/arch/powerpc/boot_of.c     |    7 ++++---
 xen/arch/powerpc/time.c        |    2 +-
 xen/include/asm-powerpc/time.h |    2 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff -r 887e1cbac615 -r 20bd3b7b7519 xen/arch/powerpc/boot_of.c
--- a/xen/arch/powerpc/boot_of.c	Mon Dec 11 20:50:32 2006 -0500
+++ b/xen/arch/powerpc/boot_of.c	Fri Dec 15 08:16:56 2006 -0500
@@ -1159,6 +1159,7 @@ static int __init boot_of_cpus(void)
     s32 cpuid;
     u32 cpu_clock[2];
     extern uint cpu_hard_id[NR_CPUS];
+    u32 tbf;
 
     /* Look up which CPU we are running on right now and get all info
      * from there */
@@ -1173,12 +1174,12 @@ static int __init boot_of_cpus(void)
 
     cpu_node = bootcpu_node;
 
-    result = of_getprop(cpu_node, "timebase-frequency", &timebase_freq,
-                        sizeof(timebase_freq));
+    result = of_getprop(cpu_node, "timebase-frequency", &tbf, sizeof(tbf));
+    timebase_freq = tbf;
     if (result == OF_FAILURE) {
         of_panic("Couldn't get timebase frequency!\n");
     }
-    of_printf("OF: timebase-frequency = %d Hz\n", timebase_freq);
+    of_printf("OF: timebase-frequency = %ld Hz\n", timebase_freq);
 
     result = of_getprop(cpu_node, "clock-frequency", cpu_clock,
                         sizeof(cpu_clock));

diff -r 887e1cbac615 -r 20bd3b7b7519 xen/arch/powerpc/time.c
--- a/xen/arch/powerpc/time.c	Mon Dec 11 20:50:32 2006 -0500
+++ b/xen/arch/powerpc/time.c	Fri Dec 15 08:16:56 2006 -0500
@@ -32,7 +32,7 @@ static int cpu_has_hdec = 1;
 static int cpu_has_hdec = 1;
 ulong ticks_per_usec;
 unsigned long cpu_khz;
-unsigned int timebase_freq;
+s64 timebase_freq;
 
 s_time_t get_s_time(void)
 {

diff -r 887e1cbac615 -r 20bd3b7b7519 xen/include/asm-powerpc/time.h
--- a/xen/include/asm-powerpc/time.h	Mon Dec 11 20:50:32 2006 -0500
+++ b/xen/include/asm-powerpc/time.h	Fri Dec 15 08:16:56 2006 -0500
@@ -27,7 +27,7 @@
 #include <xen/percpu.h>
 #include <asm/processor.h>
 
-extern unsigned int timebase_freq;
+extern s64 timebase_freq;
 
 #define CLOCK_TICK_RATE timebase_freq
 #define watchdog_disable() ((void)0)

___
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel
[XenPPC] [xenppc-unstable] [XEN][POWERPC] workaround for context_switch() bug
# HG changeset patch
# User Jimi Xenidis [EMAIL PROTECTED]
# Node ID 6af601c5ebe192a0de72430cdd94da5ba46ff287
# Parent  20bd3b7b7519e01f7b6bfa97c7a655e1dc027f5d
[XEN][POWERPC] workaround for context_switch() bug

We have a bug in that if we switch domains in schedule() we switch
right away regardless of whatever else is pending. This means that if
the timer goes off while in schedule(), the next domain will be
preempted by the interval defined below. So until we fix our
context_switch(), the following workaround will make sure that the
domain we switch to does not run for too long, so we can continue to
service the other timers in the timer queue, while the value is long
enough to escape this particular timer event.

Signed-off-by: Jimi Xenidis [EMAIL PROTECTED]
---
 xen/arch/powerpc/exceptions.c | 19 +++++++++++++++++--
 1 files changed, 17 insertions(+), 2 deletions(-)

diff -r 20bd3b7b7519 -r 6af601c5ebe1 xen/arch/powerpc/exceptions.c
--- a/xen/arch/powerpc/exceptions.c	Fri Dec 15 08:16:56 2006 -0500
+++ b/xen/arch/powerpc/exceptions.c	Fri Dec 15 08:36:03 2006 -0500
@@ -35,7 +35,9 @@ extern ulong ppc_do_softirq(ulong orig_m
 extern ulong ppc_do_softirq(ulong orig_msr);
 extern void do_timer(struct cpu_user_regs *regs);
 extern void do_dec(struct cpu_user_regs *regs);
-extern void program_exception(struct cpu_user_regs *regs, unsigned long cookie);
+extern void program_exception(struct cpu_user_regs *regs,
+                              unsigned long cookie);
+extern int reprogram_timer(s_time_t timeout);
 
 int hdec_sample = 0;
 
@@ -43,7 +45,20 @@ void do_timer(struct cpu_user_regs *regs
 {
     /* Set HDEC high so it stops firing and can be reprogrammed by
      * set_preempt() */
-    mthdec(INT_MAX);
+    /* FIXME! HACK ALERT!
+     *
+     * We have a bug in that if we switch domains in schedule() we
+     * switch right away regardless of whatever else is pending. This
+     * means that if the timer goes off while in schedule(), the next
+     * domain will be preempted by the interval defined below. So
+     * until we fix our context_switch(), the following workaround
+     * will make sure that the domain we switch to does not run for
+     * too long, so we can continue to service the other timers in
+     * the timer queue, and that the value is long enough to escape
+     * this particular timer event.
+     */
+    reprogram_timer(NOW() + MILLISECS(1));
+
     raise_softirq(TIMER_SOFTIRQ);
 }
[XenPPC] copy_page speedup using dcbz on target
Using dcbz avoids first reading a cache line from memory before writing
to the line.

Timing results (starting with a clean cache, i.e. no write-backs for
dirty lines):

JS20:
elapsed time:            0x9f5e
elapsed time using dcbz: 0x569e
elapsed time:            0x9fe9
elapsed time using dcbz: 0x5765

JS21:
elapsed time:            0x089e
elapsed time using dcbz: 0x0439
elapsed time:            0x0886
elapsed time using dcbz: 0x0438

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

typedef unsigned char uchar;
typedef unsigned long ulong;

#define LINE_SIZE 128
#define PAGE_SIZE 0x1000
#define BUF1_SIZE (PAGE_SIZE * 64)
#define BUF2_SIZE (PAGE_SIZE)
#define BUF3_SIZE (0x80)

static __inline__ ulong time_base(void);
static __inline__ void copy_page(void *dp, void *sp);
static __inline__ void cacheable_copy_page(void *dp, void *sp);
static __inline__ void cacheable_clear_page(void *addr);
static uchar clean_cache(uchar *buf3);

int main(int argc, char **argv)
{
    int i;
    ulong tb1, tb2;
    uchar *buf1, *buf2, *buf3, *bufp;

    buf1 = malloc(BUF1_SIZE + PAGE_SIZE);
    buf2 = malloc(BUF2_SIZE + PAGE_SIZE);
    buf3 = malloc(BUF3_SIZE + PAGE_SIZE);
    buf1 = (uchar *)((ulong)(buf1 + (PAGE_SIZE - 1)) & ~(PAGE_SIZE - 1));
    buf2 = (uchar *)((ulong)(buf2 + (PAGE_SIZE - 1)) & ~(PAGE_SIZE - 1));
    buf3 = (uchar *)((ulong)(buf3 + (PAGE_SIZE - 1)) & ~(PAGE_SIZE - 1));
    memset(buf1, 1, BUF1_SIZE);
    memset(buf2, 2, BUF2_SIZE);
    memset(buf3, 3, BUF3_SIZE);

    clean_cache(buf3);
    tb1 = time_base();
    for (bufp = buf1, i = 0; i < 4; i++, bufp += PAGE_SIZE*16) {
        copy_page(bufp, buf2);
        copy_page(bufp+(PAGE_SIZE*1), buf2);
        copy_page(bufp+(PAGE_SIZE*2), buf2);
        copy_page(bufp+(PAGE_SIZE*3), buf2);
        copy_page(bufp+(PAGE_SIZE*4), buf2);
        copy_page(bufp+(PAGE_SIZE*5), buf2);
        copy_page(bufp+(PAGE_SIZE*6), buf2);
        copy_page(bufp+(PAGE_SIZE*7), buf2);
        copy_page(bufp+(PAGE_SIZE*8), buf2);
        copy_page(bufp+(PAGE_SIZE*9), buf2);
        copy_page(bufp+(PAGE_SIZE*10), buf2);
        copy_page(bufp+(PAGE_SIZE*11), buf2);
        copy_page(bufp+(PAGE_SIZE*12), buf2);
        copy_page(bufp+(PAGE_SIZE*13), buf2);
        copy_page(bufp+(PAGE_SIZE*14), buf2);
        copy_page(bufp+(PAGE_SIZE*15), buf2);
    }
    tb2 = time_base();
    printf("elapsed time: 0x%016lx\n", tb2 - tb1);

    clean_cache(buf3);
    tb1 = time_base();
    for (bufp = buf1, i = 0; i < 4; i++, bufp += PAGE_SIZE*16) {
        cacheable_copy_page(bufp, buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*1), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*2), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*3), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*4), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*5), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*6), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*7), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*8), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*9), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*10), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*11), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*12), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*13), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*14), buf2);
        cacheable_copy_page(bufp+(PAGE_SIZE*15), buf2);
    }
    tb2 = time_base();
    printf("elapsed time using dcbz: 0x%016lx\n", tb2 - tb1);

    return(0);
}

static __inline__ ulong time_base(void)
{
    ulong tb;

    __asm__ __volatile__(
        "mftb %0    # read time base"
        : "=r" (tb));
    return tb;
}

static __inline__ void cacheable_clear_page(void *addr)
{
    ulong lines, line_size;

    line_size = LINE_SIZE;
    lines = PAGE_SIZE / line_size;

    __asm__ __volatile__(
        "   mtctr   %1      # clear_page\n\
        1:  dcbz    0,%0\n\
            add     %0,%0,%3\n\
            bdnz    1b"
        : "=r" (addr)
        : "r" (lines), "0" (addr), "r" (line_size)
        : "%ctr", "memory");
}

static __inline__ void copy_page(void *dp, void *sp)
{
    ulong dwords, dword_size;

    dword_size = 8;
    dwords = (PAGE_SIZE / dword_size) - 1;

    __asm__ __volatile__(
        "   mtctr   %2      # copy_page\n\
            ld      %2,0(%1)\n\
            std     %2,0(%0)\n\
        1:  ldu     %2,8(%1)\n\
            stdu    %2,8(%0)\n\
            bdnz    1b"
        : /* no result */
        : "r" (dp), "r" (sp), "r" (dwords)
        : "%ctr", "memory");
}

static __inline__ void cacheable_copy_page(void *dp, void *sp)
{
    cacheable_clear_page(dp);
    copy_page(dp, sp);
}

static uchar clean_cache(uchar *buf3)
{
    int i;
    uchar uc = 0, *ucp = buf3;

    for (i = 0; i < BUF3_SIZE / LINE_SIZE; i++) {
        uc += *ucp;
        ucp += LINE_SIZE;
    }
    return(uc);
}
[XenPPC] schedule() vs softirqs
PowerPC's timer interrupt (called the decrementer) is a one-shot timer,
not periodic. When it goes off, entering the hypervisor, we first set it
very high so it won't interrupt hypervisor code, then
raise_softirq(TIMER_SOFTIRQ). We know that timer_softirq_action() will
then call reprogram_timer(), which will reprogram the decrementer to the
appropriate value.

We recently uncovered a bug on PowerPC where if a timer tick arrives
just inside schedule() while interrupts are still enabled, the
decrementer is never reprogrammed to that appropriate value. This is
because once inside schedule(), we never handle any subsequent softirqs:
we call context_switch() and resume the guest.

I believe the tick problem affects periodic timers (i.e. x86) as well,
though less drastically. With a CPU-bound guest, it would result in
dropped ticks: TIMER_SOFTIRQ is set and not handled, and when the timer
expires again it is re-set. In other cases, it would result in some
timer ticks being delivered very late. I don't know what effect this
might have on guests, perhaps with sensitive time-slewing code.

In addition, when SCHEDULE_SOFTIRQ is set, all greater softirqs
(including NMI) will not be handled until the next hypervisor
invocation. This is pretty anti-social behavior for a softirq handler.

One solution would be to have schedule() *not* call context_switch()
directly, but rather set a flag (or a next vcpu pointer) and return.
That would allow other softirqs to be processed normally. Once
do_softirq() returns to assembly, we can check the next vcpu pointer
and call context_switch().

(This solution would enable a PowerPC optimization as well: we would
like to lazily save non-volatile registers. We can't do this unless the
exception handler regains control from do_softirq() before
context_switch() is called.)

Thoughts?

-- 
Hollis Blanchard
IBM Linux Technology Center
[XenPPC] Re: [Xen-devel] schedule() vs softirqs
On 15/12/06 17:27, Hollis Blanchard [EMAIL PROTECTED] wrote:

> We recently uncovered a bug on PowerPC where if a timer tick arrives
> just inside schedule() while interrupts are still enabled, the
> decrementer is never reprogrammed to that appropriate value. This is
> because once inside schedule(), we never handle any subsequent
> softirqs: we call context_switch() and resume the guest.

Easily fixed. You need to handle softirqs in the exit path to guest
context. You need to do this final check with interrupts disabled to
avoid races.

 -- Keir
[XenPPC] Machine check: instruction-fetch TLB tablewalk
Just saw this on a JS21 blade (internal name is cso52):

(XEN) MACHINE CHECK: IS Recoverable
(XEN) [ Xen-3.0-unstable ]
(XEN) CPU: DOMID:
(XEN) pc 10046510 msr 000cf032
(XEN) lr 10063bf4 ctr 0fde93c0
(XEN) srr0 srr1
(XEN) r00: 10063be4 ffda9710 f7fcd470 0003
(XEN) r04: 100a9268 0001 0030 fefefeff
(XEN) r08: 0fecb4d8 100a 0019
(XEN) r12: 28242424 100a3300 1002
(XEN) r16: 100a 100a9008 100a 1008
(XEN) r20: 1006
(XEN) r24: 0008 100a9df8 100a
(XEN) r28: 100a9268 100a 10063bc0
(XEN) dar 0xffda9628, dsisr 0x0220
(XEN) hid4 0x
(XEN) ---[ backtrace ]---
(XEN) SP (ffda9710) is not in xen space
(XEN) SRR1: 0x000cf032
(XEN) 0b11: Exception caused by a hardware uncorrectable
(XEN)       error (UE) detected while doing a reload of an
(XEN)       instruction-fetch TLB tablewalk.
(XEN)
(XEN) DSISR: 0x0220
(XEN) program_exception: machine check
(XEN) machine_halt called: spinning
(XEN) machine_halt called
[XenPPC] Re: [Xen-devel] schedule() vs softirqs
On Fri, 2006-12-15 at 17:36 +0000, Keir Fraser wrote:
> On 15/12/06 17:27, Hollis Blanchard [EMAIL PROTECTED] wrote:
> > We recently uncovered a bug on PowerPC where if a timer tick arrives
> > just inside schedule() while interrupts are still enabled, the
> > decrementer is never reprogrammed to that appropriate value. This is
> > because once inside schedule(), we never handle any subsequent
> > softirqs: we call context_switch() and resume the guest.
>
> Easily fixed. You need to handle softirqs in the exit path to guest
> context. You need to do this final check with interrupts disabled to
> avoid races.

Ah OK, I see now how x86 is doing that. I don't think that code flow
really makes sense: why would you jump out of do_softirq() into
assembly just to call do_softirq() again? Also, that doesn't solve the
lazy register saving problem.

However, I think I see how we can implement our desired
context_switch() scheme in arch-specific code. The context_switch()
call in schedule() will return, so please don't add a BUG() after
that. :)

-- 
Hollis Blanchard
IBM Linux Technology Center
Re: [XenPPC] copy_page speedup using dcbz on target
On Fri, 2006-12-15 at 11:50 -0500, poff wrote:
> Using dcbz avoids first reading a cache line from memory before
> writing to the line. Timing results (starting with clean cache, ie no
> write-backs for dirty lines):

So do you have a patch for copy_page()?

-- 
Hollis Blanchard
IBM Linux Technology Center
[XenPPC] [xenppc-unstable] [POWERPC][XEN] Add support for || as a xen/dom0 commandline divider.
# HG changeset patch
# User Hollis Blanchard [EMAIL PROTECTED]
# Node ID dbc74db14a4b39d359365fcf8257216d968fa269
# Parent  887e1cbac6154da0a3a3c2433fbc5b0fc2a1c9b8
[POWERPC][XEN] Add support for "||" as a xen/dom0 commandline divider.

Signed-off-by: Jerone Young [EMAIL PROTECTED]
Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
---
 xen/arch/powerpc/boot_of.c | 16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

diff -r 887e1cbac615 -r dbc74db14a4b xen/arch/powerpc/boot_of.c
--- a/xen/arch/powerpc/boot_of.c	Mon Dec 11 20:50:32 2006 -0500
+++ b/xen/arch/powerpc/boot_of.c	Tue Dec 12 14:35:07 2006 -0600
@@ -1070,10 +1070,11 @@ static void * __init boot_of_module(ulon
     static module_t mods[4];
     ulong mod0_start;
     ulong mod0_size;
-    static const char sepr[] = " -- ";
+    static const char *sepr[] = {" -- ", " || "};
+    int sepr_index;
     extern char dom0_start[] __attribute__ ((weak));
     extern char dom0_size[] __attribute__ ((weak));
-    const char *p;
+    const char *p = NULL;
     int mod;
     void *oft;
@@ -1124,11 +1125,18 @@ static void * __init boot_of_module(ulon
     of_printf("%s: dom0 mod @ 0x%016x[0x%x]\n", __func__,
               mods[mod].mod_start, mods[mod].mod_end);
-    p = strstr((char *)(ulong)mbi->cmdline, sepr);
+
+    /* look for delimiter: "--" or "||" */
+    for (sepr_index = 0; sepr_index < ARRAY_SIZE(sepr); sepr_index++) {
+        p = strstr((char *)(ulong)mbi->cmdline, sepr[sepr_index]);
+        if (p != NULL)
+            break;
+    }
+
     if (p != NULL) {
         /* Xen proper should never know about the dom0 args. */
         *(char *)p = '\0';
-        p += sizeof (sepr) - 1;
+        p += strlen(sepr[sepr_index]);
         mods[mod].string = (u32)(ulong)p;
         of_printf("%s: dom0 mod string: %s\n", __func__, p);
     }
[XenPPC] [xenppc-unstable] [POWERPC] merge
# HG changeset patch
# User Hollis Blanchard [EMAIL PROTECTED]
# Node ID 9a758f814f60166dcf4a386bb9835f58c8f68502
# Parent  dbc74db14a4b39d359365fcf8257216d968fa269
# Parent  6af601c5ebe192a0de72430cdd94da5ba46ff287
[POWERPC] merge

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
---
 xen/arch/powerpc/boot_of.c     |    7 ++++---
 xen/arch/powerpc/exceptions.c  |   19 +++++++++++++++++--
 xen/arch/powerpc/time.c        |    2 +-
 xen/include/asm-powerpc/time.h |    2 +-
 4 files changed, 23 insertions(+), 7 deletions(-)

diff -r dbc74db14a4b -r 9a758f814f60 xen/arch/powerpc/boot_of.c
--- a/xen/arch/powerpc/boot_of.c	Tue Dec 12 14:35:07 2006 -0600
+++ b/xen/arch/powerpc/boot_of.c	Fri Dec 15 13:37:38 2006 -0600
@@ -1167,6 +1167,7 @@ static int __init boot_of_cpus(void)
     s32 cpuid;
     u32 cpu_clock[2];
     extern uint cpu_hard_id[NR_CPUS];
+    u32 tbf;
 
     /* Look up which CPU we are running on right now and get all info
      * from there */
@@ -1181,12 +1182,12 @@ static int __init boot_of_cpus(void)
 
     cpu_node = bootcpu_node;
 
-    result = of_getprop(cpu_node, "timebase-frequency", &timebase_freq,
-                        sizeof(timebase_freq));
+    result = of_getprop(cpu_node, "timebase-frequency", &tbf, sizeof(tbf));
+    timebase_freq = tbf;
     if (result == OF_FAILURE) {
         of_panic("Couldn't get timebase frequency!\n");
     }
-    of_printf("OF: timebase-frequency = %d Hz\n", timebase_freq);
+    of_printf("OF: timebase-frequency = %ld Hz\n", timebase_freq);
 
     result = of_getprop(cpu_node, "clock-frequency", cpu_clock,
                         sizeof(cpu_clock));

diff -r dbc74db14a4b -r 9a758f814f60 xen/arch/powerpc/exceptions.c
--- a/xen/arch/powerpc/exceptions.c	Tue Dec 12 14:35:07 2006 -0600
+++ b/xen/arch/powerpc/exceptions.c	Fri Dec 15 13:37:38 2006 -0600
@@ -35,7 +35,9 @@ extern ulong ppc_do_softirq(ulong orig_m
 extern ulong ppc_do_softirq(ulong orig_msr);
 extern void do_timer(struct cpu_user_regs *regs);
 extern void do_dec(struct cpu_user_regs *regs);
-extern void program_exception(struct cpu_user_regs *regs, unsigned long cookie);
+extern void program_exception(struct cpu_user_regs *regs,
+                              unsigned long cookie);
+extern int reprogram_timer(s_time_t timeout);
 
 int hdec_sample = 0;
 
@@ -43,7 +45,20 @@ void do_timer(struct cpu_user_regs *regs
 {
     /* Set HDEC high so it stops firing and can be reprogrammed by
      * set_preempt() */
-    mthdec(INT_MAX);
+    /* FIXME! HACK ALERT!
+     *
+     * We have a bug in that if we switch domains in schedule() we
+     * switch right away regardless of whatever else is pending. This
+     * means that if the timer goes off while in schedule(), the next
+     * domain will be preempted by the interval defined below. So
+     * until we fix our context_switch(), the following workaround
+     * will make sure that the domain we switch to does not run for
+     * too long, so we can continue to service the other timers in
+     * the timer queue, and that the value is long enough to escape
+     * this particular timer event.
+     */
+    reprogram_timer(NOW() + MILLISECS(1));
+
     raise_softirq(TIMER_SOFTIRQ);
 }

diff -r dbc74db14a4b -r 9a758f814f60 xen/arch/powerpc/time.c
--- a/xen/arch/powerpc/time.c	Tue Dec 12 14:35:07 2006 -0600
+++ b/xen/arch/powerpc/time.c	Fri Dec 15 13:37:38 2006 -0600
@@ -32,7 +32,7 @@ static int cpu_has_hdec = 1;
 static int cpu_has_hdec = 1;
 ulong ticks_per_usec;
 unsigned long cpu_khz;
-unsigned int timebase_freq;
+s64 timebase_freq;
 
 s_time_t get_s_time(void)
 {

diff -r dbc74db14a4b -r 9a758f814f60 xen/include/asm-powerpc/time.h
--- a/xen/include/asm-powerpc/time.h	Tue Dec 12 14:35:07 2006 -0600
+++ b/xen/include/asm-powerpc/time.h	Fri Dec 15 13:37:38 2006 -0600
@@ -27,7 +27,7 @@
 #include <xen/percpu.h>
 #include <asm/processor.h>
 
-extern unsigned int timebase_freq;
+extern s64 timebase_freq;
 
 #define CLOCK_TICK_RATE timebase_freq
 #define watchdog_disable() ((void)0)
[XenPPC] Re: [Xen-devel] schedule() vs softirqs
On 15/12/06 20:41, Hollis Blanchard [EMAIL PROTECTED] wrote:

> It's an issue with any architecture with a large number of registers
> which aren't automatically saved by hardware (and a C ABI that makes
> some of them non-volatile). x86 has a small number of registers. ia64
> automatically saves them (from what I understand). So of the
> currently-supported architectures, yes, that leaves PowerPC.

I see. It sounds like returning from context_switch() is perhaps the
right thing for powerpc. That would be easier if you have per-cpu
stacks (like ia64). If not, there are issues in saving register state
later (and hence delaying your call to context_saved()), as there are
calls to do_softirq() outside your asm code (well, not many, but there
is one in domain.c for example) where you won't end up executing your
do_softirq() wrapper. In general we'd like to reserve the right to
include voluntary yield points, and that won't mix well with lazy
register saves and per-physical-cpu stacks.

 -- Keir
Re: [XenPPC] copy_page speedup using dcbz on target
> So do you have a patch for copy_page()?

In Xen for PPC, the only copy_page() is in arch/powerpc/mm.c:

extern void copy_page(void *dp, void *sp)
{
    if (on_systemsim()) {
        systemsim_memcpy(dp, sp, PAGE_SIZE);
    } else {
        memcpy(dp, sp, PAGE_SIZE);
    }
}

1) Also, copy_page is not referenced in current Xen sources?
2) dcbz depends on cacheability and cache alignment. Should a new name
   be given to this version of copy_page()?
3) Useful when PPC must do page copies in place of 'page flipping'.
Re: [XenPPC] copy_page speedup using dcbz on target
On Fri, 2006-12-15 at 16:40 -0500, poff wrote:
> > So do you have a patch for copy_page()?
>
> In Xen for PPC, the only copy_page() is in arch/powerpc/mm.c:
>
> extern void copy_page(void *dp, void *sp)
> {
>     if (on_systemsim()) {
>         systemsim_memcpy(dp, sp, PAGE_SIZE);
>     } else {
>         memcpy(dp, sp, PAGE_SIZE);
>     }
> }

Correct.

> 1) Also, copy_page is not referenced in current Xen sources?

In that case, why are you playing with it?

> 2) dcbz depends on cacheability and cache alignment. Should a new
>    name be given to this version of copy_page()?

"page" indicates cacheline-aligned. Who calls copy_page() with
non-cacheable memory?

> 3) Useful when PPC must do page copies in place of 'page flipping'.

So you're saying we should worry about it later?

-- 
Hollis Blanchard
IBM Linux Technology Center
Re: [XenPPC] copy_page speedup using dcbz on target
> > 3) Useful when PPC must do page copies in place of 'page flipping'.
>
> So you're saying we should worry about it later?

For the future, copy_page using dcbz:

diff -r 7669fca80bfc xen/arch/powerpc/mm.c
--- a/xen/arch/powerpc/mm.c	Mon Dec 04 11:46:53 2006 -0500
+++ b/xen/arch/powerpc/mm.c	Fri Dec 15 17:52:58 2006 -0500
@@ -280,7 +280,8 @@ extern void copy_page(void *dp, void *sp
     if (on_systemsim()) {
         systemsim_memcpy(dp, sp, PAGE_SIZE);
     } else {
-        memcpy(dp, sp, PAGE_SIZE);
+        clear_page(dp);
+        __copy_page(dp, sp);
     }
 }

diff -r 7669fca80bfc xen/include/asm-powerpc/page.h
--- a/xen/include/asm-powerpc/page.h	Mon Dec 04 11:46:53 2006 -0500
+++ b/xen/include/asm-powerpc/page.h	Fri Dec 15 17:52:58 2006 -0500
@@ -90,6 +90,25 @@ 1: dcbz    0,%0\n\
 
 extern void copy_page(void *dp, void *sp);
 
+static __inline__ void __copy_page(void *dp, void *sp)
+{
+    ulong dwords, dword_size;
+
+    dword_size = 8;
+    dwords = (PAGE_SIZE / dword_size) - 1;
+
+    __asm__ __volatile__(
+        "   mtctr   %2      # copy_page\n\
+            ld      %2,0(%1)\n\
+            std     %2,0(%0)\n\
+        1:  ldu     %2,8(%1)\n\
+            stdu    %2,8(%0)\n\
+            bdnz    1b"
+        : /* no result */
+        : "r" (dp), "r" (sp), "r" (dwords)
+        : "%ctr", "memory");
+}
+
 #define linear_pg_table linear_l1_table
 
 static inline int get_order(unsigned long size)
Re: [XenPPC] copy_page speedup using dcbz on target
On Fri, 2006-12-15 at 17:50 -0500, poff wrote:
> For the future, copy_page using dcbz:
> [...]
> +static __inline__ void __copy_page(void *dp, void *sp)
> [...]

I'd rather have copy_page() do "dcbz; stdu; stdu; stdu; ...; stdu" in
each loop iteration.

It would also be nice to improve memcpy, though that one is certainly
more difficult due to alignment, varying lengths, etc. Perhaps we can
borrow code from
http://penguinppc.org/dev/glibc/glibc-powerpc-cpu-addon.html

-- 
Hollis Blanchard
IBM Linux Technology Center
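A fused loop along the lines Hollis suggests (one dcbz per cache line, then the stores for that line, instead of zeroing the whole page first) might look like the sketch below. This is an untested, hypothetical variant, PowerPC-64 only; it assumes the `ulong`, `LINE_SIZE`, and `PAGE_SIZE` definitions from page.h, a 128-byte cache line, and cacheable, line-aligned source and destination pages.

```
/* Hedged sketch: establish each destination line with dcbz (no read
 * from memory), then fill it with plain 64-bit copies before moving
 * to the next line. Not the actual Xen code. */
static __inline__ void copy_page_dcbz(void *dp, void *sp)
{
    ulong *d = dp, *s = sp;
    const int dw_per_line = LINE_SIZE / sizeof(ulong);  /* 128/8 = 16 */
    int line, dw;

    for (line = 0; line < PAGE_SIZE / LINE_SIZE; line++) {
        /* Zero-allocate the line in the cache without fetching it. */
        __asm__ __volatile__("dcbz 0,%0" : : "r" (d) : "memory");
        for (dw = 0; dw < dw_per_line; dw++)
            *d++ = *s++;
    }
}
```

Compared with the clear_page-then-copy version, this touches each destination line once while it is still hot, rather than zeroing all 32 lines and then revisiting them; whether that wins in practice would need the same timebase measurement used in the benchmark above.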
[XenPPC] Re: [Xen-devel] schedule() vs softirqs
On Fri, 2006-12-15 at 21:39 +0000, Keir Fraser wrote:
> I see. It sounds like returning from context_switch() is perhaps the
> right thing for powerpc. That would be easier if you have per-cpu
> stacks (like ia64).

Yup, we have per-cpu stacks.

> If not, there are issues in saving register state later (and hence
> delaying your call to context_saved()), as there are calls to
> do_softirq() outside your asm code (well, not many, but there is one
> in domain.c for example) where you won't end up executing your
> do_softirq() wrapper. In general we'd like to reserve the right to
> include voluntary yield points, and that won't mix well with lazy
> register saves and per-physical-cpu stacks.

Oh, we have per-physical-cpu stacks. We can do that because there's no
such thing as a hypervisor thread which could block in hypervisor
space and need to be restored later. Are you saying in the future you
want to have hypervisor threads, and so we'll need per-VIRTUAL-cpu
stacks?

-- 
Hollis Blanchard
IBM Linux Technology Center