Re: [RFC] sched/eevdf: sched feature to dismiss lag on wakeup
On 3/20/24 07:04, Tobias Huschle wrote: > On Tue, Mar 19, 2024 at 02:41:14PM +0100, Vincent Guittot wrote: >> On Tue, 19 Mar 2024 at 10:08, Tobias Huschle wrote: >>> >>> On 2024-03-18 15:45, Luis Machado wrote: >>>> On 3/14/24 13:45, Tobias Huschle wrote: >>>>> On Fri, Mar 08, 2024 at 03:11:38PM +, Luis Machado wrote: >>>>>> On 2/28/24 16:10, Tobias Huschle wrote: >>>>>>> >>>>>>> Questions: >>>>>>> 1. The kworker getting its negative lag occurs in the following >>>>>>> scenario >>>>>>>- kworker and a cgroup are supposed to execute on the same CPU >>>>>>>- one task within the cgroup is executing and wakes up the >>>>>>> kworker >>>>>>>- kworker with 0 lag, gets picked immediately and finishes its >>>>>>> execution within ~5000ns >>>>>>>- on dequeue, kworker gets assigned a negative lag >>>>>>>Is this expected behavior? With this short execution time, I >>>>>>> would >>>>>>>expect the kworker to be fine. >>>>>> >>>>>> That strikes me as a bit odd as well. Have you been able to determine >>>>>> how a negative lag >>>>>> is assigned to the kworker after such a short runtime? >>>>>> >>>>> >>>>> I did some more trace reading though and found something. >>>>> >>>>> What I observed if everything runs regularly: >>>>> - vhost and kworker run alternating on the same CPU >>>>> - if the kworker is done, it leaves the runqueue >>>>> - vhost wakes up the kworker if it needs it >>>>> --> this means: >>>>> - vhost starts alone on an otherwise empty runqueue >>>>> - it seems like it never gets dequeued >>>>> (unless another unrelated task joins or migration hits) >>>>> - if vhost wakes up the kworker, the kworker gets selected >>>>> - vhost runtime > kworker runtime >>>>> --> kworker gets positive lag and gets selected immediately next >>>>> time >>>>> >>>>> What happens if it does go wrong: >>>>> From what I gather, there seem to be occasions where the vhost either >>>>> executes suprisingly quick, or the kworker surprinsingly slow. If >>>>> these >>>>> outliers reach critical values, it can happen, that >>>>>vhost runtime < kworker runtime >>>>> which now causes the kworker to get the negative lag. >>>>> >>>>> In this case it seems like, that the vhost is very fast in waking up >>>>> the kworker. And coincidentally, the kworker takes, more time than >>>>> usual >>>>> to finish. We speak of 4-digit to low 5-digit nanoseconds. >>>>> >>>>> So, for these outliers, the scheduler extrapolates that the kworker >>>>> out-consumes the vhost and should be slowed down, although in the >>>>> majority >>>>> of other cases this does not happen. >>>> >>>> Thanks for providing the above details Tobias. It does seem like EEVDF >>>> is strict >>>> about the eligibility checks and making tasks wait when their lags are >>>> negative, even >>>> if just a little bit as in the case of the kworker. >>>> >>>> There was a patch to disable the eligibility checks >>>> (https://lore.kernel.org/lkml/20231013030213.2472697-1-youssefes...@chromium.org/), >>>> which would make EEVDF more like EVDF, though the deadline comparison >>>> would >>>> probably still favor the vhost task instead of the kworker with the >>>> negative lag. >>>> >>>> I'm not sure if you tried it, but I thought I'd mention it. >>> >>> Haven't seen that one yet. Unfortunately, it does not help to ignore the >>> eligibility. >>> >>> I'm inclined to rather propose propose a documentation change, which >>> describes that tasks should not rely on woken up tasks being scheduled >>> immediately. >> >> Where do you see such an assumption ? Even before eevdf, there were >> nothing that ensures such behavior. When using CFS (legacy or eevdf) >> tasks, you can't know if the newly wakeup task will run 1st or not >> > > There was no guarantee of course. place_entity was reducing the vruntime of > woken up tasks though, giving it a slight boost, right?. For the scenario > that I observed, that boost was enough to make sure, that the woken up tasks > gets scheduled consistently. This might still not be true for all scenarios, > but in general EEVDF seems to be stricter with woken up tasks. It seems that way, as EEVDF will do eligibility and deadline checks before scheduling a task, so a task would have to satisfy both of those checks. I think we have some special treatment for when a task initially joins the competition, in which case we halve its slice. But I don't think there is any special treatment for woken tasks anymore. There was also a fix (63304558ba5dcaaff9e052ee43cfdcc7f9c29e85) to try to reduce the number of wake up preemptions under some conditions, under the RUN_TO_PARITY feature.
Re: [RFC] sched/eevdf: sched feature to dismiss lag on wakeup
On 3/14/24 13:45, Tobias Huschle wrote: > On Fri, Mar 08, 2024 at 03:11:38PM +0000, Luis Machado wrote: >> On 2/28/24 16:10, Tobias Huschle wrote: >>> >>> Questions: >>> 1. The kworker getting its negative lag occurs in the following scenario >>>- kworker and a cgroup are supposed to execute on the same CPU >>>- one task within the cgroup is executing and wakes up the kworker >>>- kworker with 0 lag, gets picked immediately and finishes its >>> execution within ~5000ns >>>- on dequeue, kworker gets assigned a negative lag >>>Is this expected behavior? With this short execution time, I would >>>expect the kworker to be fine. >> >> That strikes me as a bit odd as well. Have you been able to determine how a >> negative lag >> is assigned to the kworker after such a short runtime? >> > > I did some more trace reading though and found something. > > What I observed if everything runs regularly: > - vhost and kworker run alternating on the same CPU > - if the kworker is done, it leaves the runqueue > - vhost wakes up the kworker if it needs it > --> this means: > - vhost starts alone on an otherwise empty runqueue > - it seems like it never gets dequeued > (unless another unrelated task joins or migration hits) > - if vhost wakes up the kworker, the kworker gets selected > - vhost runtime > kworker runtime > --> kworker gets positive lag and gets selected immediately next time > > What happens if it does go wrong: > From what I gather, there seem to be occasions where the vhost either > executes suprisingly quick, or the kworker surprinsingly slow. If these > outliers reach critical values, it can happen, that >vhost runtime < kworker runtime > which now causes the kworker to get the negative lag. > > In this case it seems like, that the vhost is very fast in waking up > the kworker. And coincidentally, the kworker takes, more time than usual > to finish. We speak of 4-digit to low 5-digit nanoseconds. > > So, for these outliers, the scheduler extrapolates that the kworker > out-consumes the vhost and should be slowed down, although in the majority > of other cases this does not happen. Thanks for providing the above details Tobias. It does seem like EEVDF is strict about the eligibility checks and making tasks wait when their lags are negative, even if just a little bit as in the case of the kworker. There was a patch to disable the eligibility checks (https://lore.kernel.org/lkml/20231013030213.2472697-1-youssefes...@chromium.org/), which would make EEVDF more like EVDF, though the deadline comparison would probably still favor the vhost task instead of the kworker with the negative lag. I'm not sure if you tried it, but I thought I'd mention it.
Re: [RFC] sched/eevdf: sched feature to dismiss lag on wakeup
Hi Tobias, On 2/28/24 16:10, Tobias Huschle wrote: > The previously used CFS scheduler gave tasks that were woken up an > enhanced chance to see runtime immediately by deducting a certain value > from its vruntime on runqueue placement during wakeup. > > This property was used by some, at least vhost, to ensure, that certain > kworkers are scheduled immediately after being woken up. The EEVDF > scheduler, does not support this so far. Instead, if such a woken up > entitiy carries a negative lag from its previous execution, it will have > to wait for the current time slice to finish, which affects the > performance of the process expecting the immediate execution negatively. > > To address this issue, implement EEVDF strategy #2 for rejoining > entities, which dismisses the lag from previous execution and allows > the woken up task to run immediately (if no other entities are deemed > to be preferred for scheduling by EEVDF). > > The vruntime is decremented by an additional value of 1 to make sure, > that the woken up tasks gets to actually run. This is of course not > following strategy #2 in an exact manner but guarantees the expected > behavior for the scenario described above. Without the additional > decrement, the performance goes south even more. So there are some > side effects I could not get my head around yet. > > Questions: > 1. The kworker getting its negative lag occurs in the following scenario >- kworker and a cgroup are supposed to execute on the same CPU >- one task within the cgroup is executing and wakes up the kworker >- kworker with 0 lag, gets picked immediately and finishes its > execution within ~5000ns >- on dequeue, kworker gets assigned a negative lag >Is this expected behavior? With this short execution time, I would >expect the kworker to be fine. That strikes me as a bit odd as well. Have you been able to determine how a negative lag is assigned to the kworker after such a short runtime? I was looking at a different thread (https://lore.kernel.org/lkml/20240226082349.302363-1-yu.c.c...@intel.com/) that uncovers a potential overflow in the eligibility calculation. Though I don't think that is the case for this particular vhost problem.
Re: [Patch 0/6] [Patch 0/6] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver VIII
Hi, On Fri, 2009-07-31 at 16:10 +1000, David Gibson wrote: On Mon, Jul 27, 2009 at 05:41:52AM +0530, K.Prasad wrote: Hi David, I'm back with a new version of patches after a brief hiatus! After much deliberation about modifying the code to change the timing of signal delivery to user-space, it has been decided to retain the existing behaviour i.e. SIGTRAP delivered to user-space after execution of causative instruction although exception is raised before execution of it. Ok. Except, presumably for ptrace, since changing that would break gdb. Yes. GDB works with some assumptions in mind. Hardware breakpoints (they're not supported right now) trigger before execution, then GDB just removes them and stepi's until we're past the breakpoint. As for HW watchpoints, we always stop before execution (by ppc design presumably). GDB will check the value before the trigger, remove the HW watch, stepi it, and will check the value again to see if it has changed. If we're to trigger after the data load/store has happened, we could have breakage in GDB. Regards, Luis ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] powerpc: clean up the Book-E HW watchpoint support
Works for me. I presume you had positive results on the Book-E as well. By the way, thanks for cleaning it up. Luis On Fri, 2008-07-25 at 14:27 -0500, Kumar Gala wrote: * CONFIG_BOOKE is selected by CONFIG_44x so we dont need both * Fixed a few comments * Go back to only using DBCR0_IDM to determine if we are using debug resources. Signed-off-by: Kumar Gala [EMAIL PROTECTED] --- Luis, can you test this on 44x. I don't expect any breakage. - k arch/powerpc/kernel/entry_32.S |6 +++--- arch/powerpc/kernel/process.c |8 arch/powerpc/kernel/ptrace.c |7 --- arch/powerpc/kernel/signal.c |2 +- 4 files changed, 12 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 81c8324..da52269 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -148,7 +148,7 @@ transfer_to_handler: /* Check to see if the dbcr0 register is set up to debug. Use the internal debug mode bit to do this. */ lwz r12,THREAD_DBCR0(r12) - andis. r12,r12,(DBCR0_IDM | DBSR_DAC1R | DBSR_DAC1W)@h + andis. r12,r12,[EMAIL PROTECTED] beq+3f /* From user and task is ptraced - load up global dbcr0 */ li r12,-1 /* clear all pending debug events */ @@ -292,7 +292,7 @@ syscall_exit_cont: /* If the process has its own DBCR0 value, load it up. The internal debug mode bit tells us that dbcr0 should be loaded. */ lwz r0,THREAD+THREAD_DBCR0(r2) - andis. r10,r0,(DBCR0_IDM | DBSR_DAC1R | DBSR_DAC1W)@h + andis. r10,r0,[EMAIL PROTECTED] bnel- load_dbcr0 #endif #ifdef CONFIG_44x @@ -720,7 +720,7 @@ restore_user: /* Check whether this process has its own DBCR0 value. The internal debug mode bit tells us that dbcr0 should be loaded. */ lwz r0,THREAD+THREAD_DBCR0(r2) - andis. r10,r0,(DBCR0_IDM | DBSR_DAC1R | DBSR_DAC1W)@h + andis. r10,r0,[EMAIL PROTECTED] bnel- load_dbcr0 #endif diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index db2497c..e030f3b 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -254,7 +254,7 @@ void do_dabr(struct pt_regs *regs, unsigned long address, return; /* Clear the DAC and struct entries. One shot trigger */ -#if (defined(CONFIG_44x) || defined(CONFIG_BOOKE)) +#if defined(CONFIG_BOOKE) mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0) ~(DBSR_DAC1R | DBSR_DAC1W | DBCR0_IDM)); #endif @@ -286,7 +286,7 @@ int set_dabr(unsigned long dabr) mtspr(SPRN_DABR, dabr); #endif -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE) +#if defined(CONFIG_BOOKE) mtspr(SPRN_DAC1, dabr); #endif @@ -373,7 +373,7 @@ struct task_struct *__switch_to(struct task_struct *prev, if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr)) set_dabr(new-thread.dabr); -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE) +#if defined(CONFIG_BOOKE) /* If new thread DAC (HW breakpoint) is the same then leave it */ if (new-thread.dabr) set_dabr(new-thread.dabr); @@ -568,7 +568,7 @@ void flush_thread(void) current-thread.dabr = 0; set_dabr(0); -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE) +#if defined(CONFIG_BOOKE) current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W); #endif } diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c index a5d0e78..66204cb 100644 --- a/arch/powerpc/kernel/ptrace.c +++ b/arch/powerpc/kernel/ptrace.c @@ -717,7 +717,7 @@ void user_disable_single_step(struct task_struct *task) struct pt_regs *regs = task-thread.regs; -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE) +#if defined(CONFIG_BOOKE) /* If DAC then do not single step, skip */ if (task-thread.dabr) return; @@ -744,10 +744,11 @@ int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, if (addr 0) return -EINVAL; + /* The bottom 3 bits in dabr are flags */ if ((data ~0x7UL) = TASK_SIZE) return -EIO; -#ifdef CONFIG_PPC64 +#ifndef CONFIG_BOOKE /* For processors using DABR (i.e. 970), the bottom 3 bits are flags. * It was assumed, on previous implementations, that 3 bits were @@ -769,7 +770,7 @@ int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, task-thread.dabr = data; #endif -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE) +#if defined(CONFIG_BOOKE) /* As described above, it was assumed 3 bits were passed with the data * address, but we will assume only the mode bits will be passed diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c index
Re: [RFC] 4xx hardware watchpoint support
Some comment, first the above negate conditional looks rather ugly, I'd rather do a #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE) dbcr0 case #else dabr case #endif Yes, it makes sense. I'll switch it around. second I wonder why we have the notify_die only for one case, that seems rather odd. Looking further the notify_die is even more odd because DIE_DABR_MATCH doesn't actually appear anywhere else in the kernel. I'd suggest simply removing it. DIE_DABR_MATCH doesn't appear anywhere else because there is only a single function responsible for handling the DABR/DAC events on powerPC with this modification. It would make sense to call this to both the DAC/DABR cases though (i.e. taking it out of the #ifdef), what do you think? Can you redo this in the normal Linux comment style, ala: /* * For processors using DABR (i.e. 970), the bottom 3 bits are flags. * It was assumed, on previous implementations, that 3 bits were * passed together with the data address, fitting the design of the * DABR register, as follows: * * bit 0: Read flag * bit 1: Write flag * bit 2: Breakpoint translation * * Thus, we use them here as so. */ and similar in few other places. Will do, thanks for reviewing this one. Regards, Luis ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [RFC] 4xx hardware watchpoint support
On Wed, 2008-07-23 at 11:53 -0400, Josh Boyer wrote: Shouldn't this (and other places) be: #if defined(CONFIG_44x) || defined(CONFIG_BOOKE) if you are going to exclude 40x for now? Otherwise this is still enabled on 405 and setting the wrong register. josh Yes, sorry. I wasn't aware of this specific define value. It makes things easier to support 405's later. Like so? This addresses Christoph's comments as well. Signed-off-by: Luis Machado [EMAIL PROTECTED] Index: linux-2.6.26/arch/powerpc/kernel/entry_32.S === --- linux-2.6.26.orig/arch/powerpc/kernel/entry_32.S2008-07-23 07:44:57.0 -0700 +++ linux-2.6.26/arch/powerpc/kernel/entry_32.S 2008-07-23 07:50:31.0 -0700 @@ -148,7 +148,7 @@ /* Check to see if the dbcr0 register is set up to debug. Use the internal debug mode bit to do this. */ lwz r12,THREAD_DBCR0(r12) - andis. r12,r12,[EMAIL PROTECTED] + andis. r12,r12,(DBCR0_IDM | DBSR_DAC1R | DBSR_DAC1W)@h beq+3f /* From user and task is ptraced - load up global dbcr0 */ li r12,-1 /* clear all pending debug events */ @@ -292,7 +292,7 @@ /* If the process has its own DBCR0 value, load it up. The internal debug mode bit tells us that dbcr0 should be loaded. */ lwz r0,THREAD+THREAD_DBCR0(r2) - andis. r10,r0,[EMAIL PROTECTED] + andis. r10,r0,(DBCR0_IDM | DBSR_DAC1R | DBSR_DAC1W)@h bnel- load_dbcr0 #endif #ifdef CONFIG_44x @@ -720,7 +720,7 @@ /* Check whether this process has its own DBCR0 value. The internal debug mode bit tells us that dbcr0 should be loaded. */ lwz r0,THREAD+THREAD_DBCR0(r2) - andis. r10,r0,[EMAIL PROTECTED] + andis. r10,r0,(DBCR0_IDM | DBSR_DAC1R | DBSR_DAC1W)@h bnel- load_dbcr0 #endif Index: linux-2.6.26/arch/powerpc/kernel/process.c === --- linux-2.6.26.orig/arch/powerpc/kernel/process.c 2008-07-23 07:44:57.0 -0700 +++ linux-2.6.26/arch/powerpc/kernel/process.c 2008-07-23 07:50:31.0 -0700 @@ -47,6 +47,8 @@ #ifdef CONFIG_PPC64 #include asm/firmware.h #endif +#include linux/kprobes.h +#include linux/kdebug.h extern unsigned long _get_SP(void); @@ -239,6 +241,35 @@ } #endif /* CONFIG_SMP */ +void do_dabr(struct pt_regs *regs, unsigned long address, + unsigned long error_code) +{ + siginfo_t info; + + if (notify_die(DIE_DABR_MATCH, dabr_match, regs, error_code, + 11, SIGSEGV) == NOTIFY_STOP) + return; + + if (debugger_dabr_match(regs)) + return; + + /* Clear the DAC and struct entries. One shot trigger */ +#if (defined(CONFIG_44x) || defined(CONFIG_BOOKE)) + mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0) ~(DBSR_DAC1R | DBSR_DAC1W + | DBCR0_IDM)); +#endif + + /* Clear the DABR */ + set_dabr(0); + + /* Deliver the signal to userspace */ + info.si_signo = SIGTRAP; + info.si_errno = 0; + info.si_code = TRAP_HWBKPT; + info.si_addr = (void __user *)address; + force_sig_info(SIGTRAP, info, current); +} + static DEFINE_PER_CPU(unsigned long, current_dabr); int set_dabr(unsigned long dabr) @@ -254,6 +285,11 @@ #if defined(CONFIG_PPC64) || defined(CONFIG_6xx) mtspr(SPRN_DABR, dabr); #endif + +#if defined(CONFIG_44x) || defined(CONFIG_BOOKE) + mtspr(SPRN_DAC1, dabr); +#endif + return 0; } @@ -337,6 +373,12 @@ if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr)) set_dabr(new-thread.dabr); +#if defined(CONFIG_44x) || defined(CONFIG_BOOKE) + /* If new thread DAC (HW breakpoint) is the same then leave it */ + if (new-thread.dabr) + set_dabr(new-thread.dabr); +#endif + new_thread = new-thread; old_thread = current-thread; @@ -525,6 +567,10 @@ if (current-thread.dabr) { current-thread.dabr = 0; set_dabr(0); + +#if defined(CONFIG_44x) || defined(CONFIG_BOOKE) + current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W); +#endif } } Index: linux-2.6.26/arch/powerpc/kernel/ptrace.c === --- linux-2.6.26.orig/arch/powerpc/kernel/ptrace.c 2008-07-23 07:44:57.0 -0700 +++ linux-2.6.26/arch/powerpc/kernel/ptrace.c 2008-07-23 07:53:45.0 -0700 @@ -703,7 +703,7 @@ if (regs != NULL) { #if defined(CONFIG_40x) || defined(CONFIG_BOOKE) - task-thread.dbcr0 = DBCR0_IDM | DBCR0_IC; + task-thread.dbcr0 |= DBCR0_IDM | DBCR0_IC; regs-msr |= MSR_DE; #else regs-msr |= MSR_SE; @@ -716,9 +716,16 @@ { struct
Re: [RFC] 4xx hardware watchpoint support
Hi, That, or adding a small function to move the bits to the appropriate registers (set_dbcr or set_dac_events). Do you think it's worth to support this facility on 405's processors? If so, i'll gladly work on a solution to it. I would think so. There's really no difference from a userspace perspective, so gdb watchpoints could be valuable there too. I'll leave it up to you though. As the 440 support is ready and the 405 needs additional tweaking due to the use of DBCR1 instead of DBCR0 and due to a different position scheme of the DAC1R/DAC1W flags inside DBCR1, i'd say we should include this code and handle the 405 case later. We might have to handle it anyway if we're going to pursue the hardware breakpoint interface work in the future. I've fixed some formatting problems. Tested on a 440 Taishan board and on a PPC970. Both worked as they should. Ok? Signed-off-by: Luis Machado [EMAIL PROTECTED] Index: linux-2.6.26/arch/powerpc/kernel/process.c === --- linux-2.6.26.orig/arch/powerpc/kernel/process.c 2008-07-20 16:56:57.0 -0700 +++ linux-2.6.26/arch/powerpc/kernel/process.c 2008-07-22 16:46:36.0 -0700 @@ -47,6 +47,8 @@ #ifdef CONFIG_PPC64 #include asm/firmware.h #endif +#include linux/kprobes.h +#include linux/kdebug.h extern unsigned long _get_SP(void); @@ -239,6 +241,36 @@ } #endif /* CONFIG_SMP */ +void do_dabr(struct pt_regs *regs, unsigned long address, + unsigned long error_code) +{ + siginfo_t info; + +#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) + if (notify_die(DIE_DABR_MATCH, dabr_match, regs, error_code, + 11, SIGSEGV) == NOTIFY_STOP) + return; + + if (debugger_dabr_match(regs)) + return; + + /* Clear the DAC and struct entries. One shot trigger. */ +#else /* (defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) */ + mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0) ~(DBSR_DAC1R | DBSR_DAC1W + | DBCR0_IDM)); +#endif + + /* Clear the DABR */ + set_dabr(0); + + /* Deliver the signal to userspace */ + info.si_signo = SIGTRAP; + info.si_errno = 0; + info.si_code = TRAP_HWBKPT; + info.si_addr = (void __user *)address; + force_sig_info(SIGTRAP, info, current); +} + static DEFINE_PER_CPU(unsigned long, current_dabr); int set_dabr(unsigned long dabr) @@ -254,6 +286,11 @@ #if defined(CONFIG_PPC64) || defined(CONFIG_6xx) mtspr(SPRN_DABR, dabr); #endif + +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + mtspr(SPRN_DAC1, dabr); +#endif + return 0; } @@ -337,6 +374,12 @@ if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr)) set_dabr(new-thread.dabr); +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + /* If new thread DAC (HW breakpoint) is the same then leave it. */ + if (new-thread.dabr) + set_dabr(new-thread.dabr); +#endif + new_thread = new-thread; old_thread = current-thread; @@ -525,6 +568,10 @@ if (current-thread.dabr) { current-thread.dabr = 0; set_dabr(0); + +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W); +#endif } } Index: linux-2.6.26/arch/powerpc/kernel/ptrace.c === --- linux-2.6.26.orig/arch/powerpc/kernel/ptrace.c 2008-07-20 16:56:57.0 -0700 +++ linux-2.6.26/arch/powerpc/kernel/ptrace.c 2008-07-22 16:41:24.0 -0700 @@ -703,7 +703,7 @@ if (regs != NULL) { #if defined(CONFIG_40x) || defined(CONFIG_BOOKE) - task-thread.dbcr0 = DBCR0_IDM | DBCR0_IC; + task-thread.dbcr0 |= DBCR0_IDM | DBCR0_IC; regs-msr |= MSR_DE; #else regs-msr |= MSR_SE; @@ -716,9 +716,16 @@ { struct pt_regs *regs = task-thread.regs; + +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + /* If DAC then do not single step, skip. */ + if (task-thread.dabr) + return; +#endif + if (regs != NULL) { #if defined(CONFIG_40x) || defined(CONFIG_BOOKE) - task-thread.dbcr0 = 0; + task-thread.dbcr0 = ~(DBCR0_IC | DBCR0_IDM); regs-msr = ~MSR_DE; #else regs-msr = ~MSR_SE; @@ -727,22 +734,71 @@ clear_tsk_thread_flag(task, TIF_SINGLESTEP); } -static int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, +int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, unsigned long data) { - /* We only support one DABR and no IABRS at the moment */ + /* For ppc64 we support one DABR and no IABR's at the moment (ppc64). + For embedded processors we support one DAC
Re: [RFC] 4xx hardware watchpoint support
This doesn't look right for how it's coded. This would be the CONFIG_4xx || CONFIG_BOOKE case, but CONFIG_4xx includes PowerPC 405. That has a different bit layout among the DBCR registers. Namely, on 405 you would be clearing the TDE and IAC1 events because the DAC events are in DBCR1, not DBCR0. Maybe guarding the 405-specific parts in a separate #if defined(CONFIG_40x) block will do? Do you think it's worth to support this facility on 405's processors? If so, i'll gladly work on a solution to it. Regards, Luis ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [RFC] 4xx hardware watchpoint support
Hi guys, Did anyone have a chance to go over this patch? Looking forward to receive feedbacks on it. Thanks! Regards, Luis On Fri, 2008-06-20 at 17:14 -0300, Luis Machado wrote: On Thu, 2008-05-22 at 13:51 +1000, Paul Mackerras wrote: Luis Machado writes: This is a patch that has been sitting idle for quite some time. I decided to move it further because it is something useful. It was originally written by Michel Darneille, based off of 2.6.16. The original patch, though, was not compatible with the current DABR logic. DABR's are used to implement hardware watchpoint support for ppc64 processors (i.e. 970, Power5 etc). 4xx's have a different debugging register layout and needs to be handled differently (they two registers: DAC and DBCR0). Yes, they are different, but they do essentially the same thing, so I think we should try and unify the handling of the two. Maybe you could rename set_dabr() to set_hw_watchpoint() or something and make it set DABR on 64-bit and classic 32-bit processors, and DAC on 4xx/Book E processors. Likewise, I don't think we need both a dabr field and a dac field in the thread_struct - one should do. Rename dabr to something else if you feel that the dabr name is too ppc64-specific. And I don't think we need both __debugger_dabr_match and __debugger_dac_match. 5) This is something i'm worried about for future features. We currently have a way to support only Hardware Watchpoints, but not Hardware Breakpoints (on 64-bit processors that have a IABR register or 32-bit processors carrying the IAC register). Looking at the code, we don't differentiate a watchpoint from a breakpoint request. A ptrace call has currently 3 arguments: REQUEST, ADDR and DATA. We use REQUEST and DATA to set a hardware watchpoint. Maybe we could use the ADDR parameter to set a hardware breakpoint? Or use it to tell the kernel whether we want a hardware watchpoint or hardware breakpoint and then pass the address of the instruction/data through the DATA parameter? What do you think? I would think there would be a different REQUEST value to mean set a hardware breakpoint. Roland McGrath (cc'd) might be able to tell us what other architectures do. Paul. Paul, Follows a re-worked patch that unifies the handling of hardware watchpoint structures for DABR-based and DAC-based processors. The flow is exactly the same for DABR-based processors. As for the DAC-based code, i've eliminated the dac field from the thread structure, since now we use the dabr field to represent DAC1 of the DAC-based processors. As a consequence, i removed all the occurrences of dac throughout the code, using dabr in those cases. The function set_dabr sets the correct register (DABR OR DAC) for each specific processor now, instead of only setting the value for DABR-based processors. do_dac was merged with do_dabr as they were mostly equivalent pieces of code. I've moved do_dabr to arch/powerpc/kernel/process.c so it is visible for DAC-based code as well. Tested on a Taishan 440GX with success. Is it OK to leave it as dabr for both DABR and DAC? What do you think of the patch overall? Thanks, Best regards, Luis Index: linux-2.6.25-oldpatch/arch/powerpc/kernel/process.c === --- linux-2.6.25-oldpatch.orig/arch/powerpc/kernel/process.c 2008-06-20 02:48:19.0 -0700 +++ linux-2.6.25-oldpatch/arch/powerpc/kernel/process.c 2008-06-20 02:51:16.0 -0700 @@ -241,6 +241,35 @@ } #endif /* CONFIG_SMP */ +void do_dabr(struct pt_regs *regs, unsigned long address, + unsigned long error_code) +{ + siginfo_t info; + +#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) + if (notify_die(DIE_DABR_MATCH, dabr_match, regs, error_code, + 11, SIGSEGV) == NOTIFY_STOP) + return; + + if (debugger_dabr_match(regs)) + return; +#else +/* Clear the DAC and struct entries. One shot trigger. */ +mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0) ~(DBSR_DAC1R | DBSR_DAC1W +| DBCR0_IDM)); +#endif + + /* Clear the DABR */ + set_dabr(0); + + /* Deliver the signal to userspace */ + info.si_signo = SIGTRAP; + info.si_errno = 0; + info.si_code = TRAP_HWBKPT; + info.si_addr = (void __user *)address; + force_sig_info(SIGTRAP, info, current); +} + static DEFINE_PER_CPU(unsigned long, current_dabr); int set_dabr(unsigned long dabr) @@ -256,6 +285,11 @@ #if defined(CONFIG_PPC64) || defined(CONFIG_6xx) mtspr(SPRN_DABR, dabr); #endif + +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + mtspr(SPRN_DAC1, dabr); +#endif + return 0; } @@ -330,6 +364,12 @@ if (unlikely(__get_cpu_var(current_dabr) != new
Re: [RFC] 4xx hardware watchpoint support
On Thu, 2008-05-22 at 13:51 +1000, Paul Mackerras wrote: Luis Machado writes: This is a patch that has been sitting idle for quite some time. I decided to move it further because it is something useful. It was originally written by Michel Darneille, based off of 2.6.16. The original patch, though, was not compatible with the current DABR logic. DABR's are used to implement hardware watchpoint support for ppc64 processors (i.e. 970, Power5 etc). 4xx's have a different debugging register layout and needs to be handled differently (they two registers: DAC and DBCR0). Yes, they are different, but they do essentially the same thing, so I think we should try and unify the handling of the two. Maybe you could rename set_dabr() to set_hw_watchpoint() or something and make it set DABR on 64-bit and classic 32-bit processors, and DAC on 4xx/Book E processors. Likewise, I don't think we need both a dabr field and a dac field in the thread_struct - one should do. Rename dabr to something else if you feel that the dabr name is too ppc64-specific. And I don't think we need both __debugger_dabr_match and __debugger_dac_match. 5) This is something i'm worried about for future features. We currently have a way to support only Hardware Watchpoints, but not Hardware Breakpoints (on 64-bit processors that have a IABR register or 32-bit processors carrying the IAC register). Looking at the code, we don't differentiate a watchpoint from a breakpoint request. A ptrace call has currently 3 arguments: REQUEST, ADDR and DATA. We use REQUEST and DATA to set a hardware watchpoint. Maybe we could use the ADDR parameter to set a hardware breakpoint? Or use it to tell the kernel whether we want a hardware watchpoint or hardware breakpoint and then pass the address of the instruction/data through the DATA parameter? What do you think? I would think there would be a different REQUEST value to mean set a hardware breakpoint. Roland McGrath (cc'd) might be able to tell us what other architectures do. Paul. Paul, Follows a re-worked patch that unifies the handling of hardware watchpoint structures for DABR-based and DAC-based processors. The flow is exactly the same for DABR-based processors. As for the DAC-based code, i've eliminated the dac field from the thread structure, since now we use the dabr field to represent DAC1 of the DAC-based processors. As a consequence, i removed all the occurrences of dac throughout the code, using dabr in those cases. The function set_dabr sets the correct register (DABR OR DAC) for each specific processor now, instead of only setting the value for DABR-based processors. do_dac was merged with do_dabr as they were mostly equivalent pieces of code. I've moved do_dabr to arch/powerpc/kernel/process.c so it is visible for DAC-based code as well. Tested on a Taishan 440GX with success. Is it OK to leave it as dabr for both DABR and DAC? What do you think of the patch overall? Thanks, Best regards, Luis Index: linux-2.6.25-oldpatch/arch/powerpc/kernel/process.c === --- linux-2.6.25-oldpatch.orig/arch/powerpc/kernel/process.c2008-06-20 02:48:19.0 -0700 +++ linux-2.6.25-oldpatch/arch/powerpc/kernel/process.c 2008-06-20 02:51:16.0 -0700 @@ -241,6 +241,35 @@ } #endif /* CONFIG_SMP */ +void do_dabr(struct pt_regs *regs, unsigned long address, + unsigned long error_code) +{ + siginfo_t info; + +#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) + if (notify_die(DIE_DABR_MATCH, dabr_match, regs, error_code, + 11, SIGSEGV) == NOTIFY_STOP) + return; + + if (debugger_dabr_match(regs)) + return; +#else +/* Clear the DAC and struct entries. One shot trigger. */ +mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0) ~(DBSR_DAC1R | DBSR_DAC1W +| DBCR0_IDM)); +#endif + + /* Clear the DABR */ + set_dabr(0); + + /* Deliver the signal to userspace */ + info.si_signo = SIGTRAP; + info.si_errno = 0; + info.si_code = TRAP_HWBKPT; + info.si_addr = (void __user *)address; + force_sig_info(SIGTRAP, info, current); +} + static DEFINE_PER_CPU(unsigned long, current_dabr); int set_dabr(unsigned long dabr) @@ -256,6 +285,11 @@ #if defined(CONFIG_PPC64) || defined(CONFIG_6xx) mtspr(SPRN_DABR, dabr); #endif + +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + mtspr(SPRN_DAC1, dabr); +#endif + return 0; } @@ -330,6 +364,12 @@ if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr)) set_dabr(new-thread.dabr); +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + /* If new thread DAC (HW breakpoint) is the same then leave it. */ + if (new-thread.dabr) + set_dabr(new-thread.dabr); +#endif
Re: [PATCH 5/6] ibm_newemac: PowerPC 440GX EMAC PHY clock workaround
Did this show up on 2.6.26 recently? Or we're expecting it to show up a bit later? Thanks, Luis On Tue, 2008-04-22 at 10:46 +1000, Benjamin Herrenschmidt wrote: From: Valentine Barshak [EMAIL PROTECTED] The PowerPC 440GX Taishan board fails to reset EMAC3 (reset timeout error) if there's no link. Because of that it fails to find PHY chip. The older ibm_emac driver had a workaround for that: the EMAC_CLK_INTERNAL/EMAC_CLK_EXTERNAL macros, which toggle the Ethernet Clock Select bit in the SDR0_MFR register. This patch does the same for ibm,emac-440gx compatible chips. The workaround forces clock on -all- EMACs, so we select clock under global emac_phy_map_lock. BenH: Made that #ifdef CONFIG_PPC_DCR_NATIVE for now as dcri_* stuff doesn't exist for MMIO type DCRs like Cell. Some future rework improvements of the DCR infrastructure will make that cleaner but for now, this makes it work. Signed-off-by: Valentine Barshak [EMAIL PROTECTED] Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED] --- drivers/net/ibm_newemac/core.c | 18 +- drivers/net/ibm_newemac/core.h |8 ++-- 2 files changed, 23 insertions(+), 3 deletions(-) --- linux-work.orig/drivers/net/ibm_newemac/core.c2008-04-22 10:11:59.0 +1000 +++ linux-work/drivers/net/ibm_newemac/core.c 2008-04-22 10:31:18.0 +1000 @@ -43,6 +43,8 @@ #include asm/io.h #include asm/dma.h #include asm/uaccess.h +#include asm/dcr.h +#include asm/dcr-regs.h #include core.h @@ -2333,6 +2335,11 @@ static int __devinit emac_init_phy(struc dev-phy.mdio_read = emac_mdio_read; dev-phy.mdio_write = emac_mdio_write; + /* Enable internal clock source */ +#ifdef CONFIG_PPC_DCR_NATIVE + if (emac_has_feature(dev, EMAC_FTR_440GX_PHY_CLK_FIX)) + dcri_clrset(SDR0, SDR0_MFR, 0, SDR0_MFR_ECS); +#endif /* Configure EMAC with defaults so we can at least use MDIO * This is needed mostly for 440GX */ @@ -2365,6 +2372,12 @@ static int __devinit emac_init_phy(struc if (!emac_mii_phy_probe(dev-phy, i)) break; } + + /* Enable external clock source */ +#ifdef CONFIG_PPC_DCR_NATIVE + if (emac_has_feature(dev, EMAC_FTR_440GX_PHY_CLK_FIX)) + dcri_clrset(SDR0, SDR0_MFR, SDR0_MFR_ECS, 0); +#endif mutex_unlock(emac_phy_map_lock); if (i == 0x20) { printk(KERN_WARNING %s: can't find PHY!\n, np-full_name); @@ -2490,8 +2503,11 @@ static int __devinit emac_init_config(st } /* Check EMAC version */ - if (of_device_is_compatible(np, ibm,emac4)) + if (of_device_is_compatible(np, ibm,emac4)) { dev-features |= EMAC_FTR_EMAC4; + if (of_device_is_compatible(np, ibm,emac-440gx)) + dev-features |= EMAC_FTR_440GX_PHY_CLK_FIX; + } /* Fixup some feature bits based on the device tree */ if (of_get_property(np, has-inverted-stacr-oc, NULL)) Index: linux-work/drivers/net/ibm_newemac/core.h === --- linux-work.orig/drivers/net/ibm_newemac/core.h2008-04-22 10:06:31.0 +1000 +++ linux-work/drivers/net/ibm_newemac/core.h 2008-04-22 10:30:03.0 +1000 @@ -301,6 +301,10 @@ struct emac_instance { * Set if we have new type STACR with STAOPC */ #define EMAC_FTR_HAS_NEW_STACR 0x0040 +/* + * Set if we need phy clock workaround for 440gx + */ +#define EMAC_FTR_440GX_PHY_CLK_FIX 0x0080 /* Right now, we don't quite handle the always/possible masks on the @@ -312,8 +316,8 @@ enum { EMAC_FTRS_POSSIBLE = #ifdef CONFIG_IBM_NEW_EMAC_EMAC4 - EMAC_FTR_EMAC4 | EMAC_FTR_HAS_NEW_STACR| - EMAC_FTR_STACR_OC_INVERT| + EMAC_FTR_EMAC4 | EMAC_FTR_HAS_NEW_STACR | + EMAC_FTR_STACR_OC_INVERT | EMAC_FTR_440GX_PHY_CLK_FIX | #endif #ifdef CONFIG_IBM_NEW_EMAC_TAH EMAC_FTR_HAS_TAH| ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev -- Luis Machado Software Engineer IBM Linux Technology Center ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [RFC] 4xx hardware watchpoint support
On Thu, 2008-05-22 at 13:51 +1000, Paul Mackerras wrote: Luis Machado writes: This is a patch that has been sitting idle for quite some time. I decided to move it further because it is something useful. It was originally written by Michel Darneille, based off of 2.6.16. The original patch, though, was not compatible with the current DABR logic. DABR's are used to implement hardware watchpoint support for ppc64 processors (i.e. 970, Power5 etc). 4xx's have a different debugging register layout and needs to be handled differently (they two registers: DAC and DBCR0). Yes, they are different, but they do essentially the same thing, so I think we should try and unify the handling of the two. Maybe you could rename set_dabr() to set_hw_watchpoint() or something and make it set DABR on 64-bit and classic 32-bit processors, and DAC on 4xx/Book E processors. Likewise, I don't think we need both a dabr field and a dac field in the thread_struct - one should do. Rename dabr to something else if you feel that the dabr name is too ppc64-specific. And I don't think we need both __debugger_dabr_match and __debugger_dac_match. Thanks for the feedback Paul. I'll try consolidating those mechanisms into a single, more general scheme. Best regards, Luis ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [RFC] 4xx hardware watchpoint support
On Wed, 2008-05-21 at 23:46 -0700, Roland McGrath wrote: I would think there would be a different REQUEST value to mean set a hardware breakpoint. Roland McGrath (cc'd) might be able to tell us what other architectures do. Other architectures don't give a good model to follow. (If anything, they just trivally virtualize their own idiosyncratic hardware.) What I want to see done for this in the future is reviving and finishing the hw_breakpoint work begun by Alan Stern, and porting that to each arch's particular hardware features. On that we'd build any new interfaces in abstract machine-independent terms, just describing the constraints of what the hardware can do, rather than having the user interface involve mimicking hardware encodings. (The existing hardware-idiosyncratic ptrace interfaces would tie into hw_breakpoint for backward compatibility.) Kumar was just mentioning this post a few messages ago: http://ozlabs.org/pipermail/linuxppc-dev/2008-May/055745.html That is a very interesting approach to handle all the differences between each processor's architecture, and a much cleaner way to set the facilities we want than the current interface we have. Do you know what is the status of this work? Did it move any further? Best Regards, Luis ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[RFC] 4xx hardware watchpoint support
Hi, This is a patch that has been sitting idle for quite some time. I decided to move it further because it is something useful. It was originally written by Michel Darneille, based off of 2.6.16. The original patch, though, was not compatible with the current DABR logic. DABR's are used to implement hardware watchpoint support for ppc64 processors (i.e. 970, Power5 etc). 4xx's have a different debugging register layout and needs to be handled differently (they two registers: DAC and DBCR0). I've refreshed the patch to a recent stable release (2.6.25.1, still apllies cleanly on 2.6.25.4), made the patch compatible with both 4xx and ppc64 processor designs, fixed some masks that didn't seem correct (the ones setting hw watch read/write modes) and refactored some of the code. Though, i'm still not happy enough with the patch as i think we could improve it a bit further. Some points i consider worth of attention: 1) There is a do_dac(...) implementation inside arch/powerpc/kernel/traps.c. I don't feel this is correct. I see that the 64-bit counterpart, do_dabr(...), is implemented inside arch/powerpc/mm/fault.c. Due to do_dac(...) being implemented inside traps.c, we need to externalize the declaration for get_dac(...) on include/asm-[powerpc|ppc]/system.h so it's made visible to that scope. We could use mfspr(...) to get the register's contents directly, but then i wouldn't make sense to have get_dac(...) in the first place. Maybe moving the do_dac(...) code to arch/powerpc/mm/fault.c would make more sense since we seem to have the address already, and won't need to call get_dac(...) to get it. 2) The change to make set_debugreg(...) and get_debugreg(...) transparent for both DAC-driven and DABR-driven processors is OK. But that shouldn't require us to externalize the declaration of set_debugreg(...) in order to use it in arch/powerpc/kernel/traps.c right? Maybe this has some relationship with the above point. 3) Maybe it would be better to come up with a way to merge both DABR and DAC/DBCR0 logic in order to get rid of dozens of processor-specific #ifdef's? This could be a bit more complex since it would require re-writing good portions of code. 4) Should i use CONFIG_40x ou CONFIG_4xx instead? Would CONFIG_4xx automatically include 40x's? I'm mainly targetting 4xx's here, though 40x's should be similar except for 403. 5) This is something i'm worried about for future features. We currently have a way to support only Hardware Watchpoints, but not Hardware Breakpoints (on 64-bit processors that have a IABR register or 32-bit processors carrying the IAC register). Looking at the code, we don't differentiate a watchpoint from a breakpoint request. A ptrace call has currently 3 arguments: REQUEST, ADDR and DATA. We use REQUEST and DATA to set a hardware watchpoint. Maybe we could use the ADDR parameter to set a hardware breakpoint? Or use it to tell the kernel whether we want a hardware watchpoint or hardware breakpoint and then pass the address of the instruction/data through the DATA parameter? What do you think? I appreciate any comments about these items and the patch itself. Best regards. Luis Index: linux-2.6.25.1/arch/powerpc/kernel/process.c === --- linux-2.6.25.1.orig/arch/powerpc/kernel/process.c 2008-05-21 07:25:45.0 -0700 +++ linux-2.6.25.1/arch/powerpc/kernel/process.c 2008-05-21 07:26:55.0 -0700 @@ -219,6 +219,28 @@ } #endif /* CONFIG_SPE */ +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + +/* Add support for HW Debug breakpoint. Use DAC register. */ +int set_dac(unsigned long dac) +{ + mtspr(SPRN_DAC1, dac); + + return 0; +} +unsigned int get_dac() +{ + return mfspr(SPRN_DAC1); +} +int set_dbcr0(unsigned long dbcr) +{ + mtspr(SPRN_DBCR0, dbcr); + + return 0; +} + +#endif + #ifndef CONFIG_SMP /* * If we are doing lazy switching of CPU state (FP, altivec or SPE), @@ -330,6 +352,13 @@ if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr)) set_dabr(new-thread.dabr); + +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + /* If new thread DAC (HW breakpoint) is the same then leave it. */ + if (new-thread.dac) + set_dac(new-thread.dac); +#endif + new_thread = new-thread; old_thread = current-thread; @@ -515,6 +544,16 @@ discard_lazy_cpu_state(); +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + if (current-thread.dac) { + current-thread.dac = 0; + /* Clear debug control. */ + current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W); + + set_dac(0); + } +#endif + if (current-thread.dabr) { current-thread.dabr = 0; set_dabr(0); Index: linux-2.6.25.1/arch/powerpc/kernel/ptrace.c === --- linux-2.6.25.1.orig/arch/powerpc/kernel/ptrace.c 2008-05-21 07:25:45.0 -0700 +++ linux-2.6.25.1/arch/powerpc/kernel/ptrace.c 2008-05-21 08:23:34.0 -0700 @@ -629,9 +629,15 @@ { struct pt_regs *regs =
Re: [RFC] 4xx hardware watchpoint support
Thanks for the inlining tip. It should be now. :-) So, basically we are looking at a cleaner and much better interface to set such hardware features? That's something that would greatly improve the communication from, say, GDB to the kernel regarding these facilities. Regards, Luis On Wed, 2008-05-21 at 16:16 -0500, Kumar Gala wrote: Two real quick notes. Take a look at: http://ozlabs.org/pipermail/linuxppc-dev/2008-May/055745.html and can you try and post the patch inline next time. Hard to provide review comments on it :) - kIndex: linux-2.6.25.1/arch/powerpc/kernel/process.c === --- linux-2.6.25.1.orig/arch/powerpc/kernel/process.c 2008-05-21 07:25:45.0 -0700 +++ linux-2.6.25.1/arch/powerpc/kernel/process.c2008-05-21 07:26:55.0 -0700 @@ -219,6 +219,28 @@ } #endif /* CONFIG_SPE */ +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + +/* Add support for HW Debug breakpoint. Use DAC register. */ +int set_dac(unsigned long dac) +{ + mtspr(SPRN_DAC1, dac); + + return 0; +} +unsigned int get_dac() +{ + return mfspr(SPRN_DAC1); +} +int set_dbcr0(unsigned long dbcr) +{ + mtspr(SPRN_DBCR0, dbcr); + + return 0; +} + +#endif + #ifndef CONFIG_SMP /* * If we are doing lazy switching of CPU state (FP, altivec or SPE), @@ -330,6 +352,13 @@ if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr)) set_dabr(new-thread.dabr); + +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + /* If new thread DAC (HW breakpoint) is the same then leave it. */ + if (new-thread.dac) + set_dac(new-thread.dac); +#endif + new_thread = new-thread; old_thread = current-thread; @@ -515,6 +544,16 @@ discard_lazy_cpu_state(); +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + if (current-thread.dac) { + current-thread.dac = 0; + /* Clear debug control. */ + current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W); + + set_dac(0); + } +#endif + if (current-thread.dabr) { current-thread.dabr = 0; set_dabr(0); Index: linux-2.6.25.1/arch/powerpc/kernel/ptrace.c === --- linux-2.6.25.1.orig/arch/powerpc/kernel/ptrace.c2008-05-21 07:25:45.0 -0700 +++ linux-2.6.25.1/arch/powerpc/kernel/ptrace.c 2008-05-21 08:23:34.0 -0700 @@ -629,9 +629,15 @@ { struct pt_regs *regs = task-thread.regs; +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + /* If DAC then do not single step, skip. */ + if (task-thread.dac) + return; +#endif + if (regs != NULL) { #if defined(CONFIG_40x) || defined(CONFIG_BOOKE) - task-thread.dbcr0 = 0; + task-thread.dbcr0 = ~(DBCR0_IC | DBCR0_IDM); regs-msr = ~MSR_DE; #else regs-msr = ~MSR_SE; @@ -640,22 +646,83 @@ clear_tsk_thread_flag(task, TIF_SINGLESTEP); } -static int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, +static int ptrace_get_debugreg(struct task_struct *task, unsigned long data) +{ + int ret; + + #if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + ret = put_user(task-thread.dac, + (unsigned long __user *)data); + #else + ret = put_user(task-thread.dabr, + (unsigned long __user *)data); + #endif + + return ret; +} + +int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, unsigned long data) { - /* We only support one DABR and no IABRS at the moment */ + /* For ppc64 we support one DABR and no IABR's at the moment (ppc64). + For embedded processors we support one DAC and no IAC's + at the moment. */ if (addr 0) return -EINVAL; - /* The bottom 3 bits are flags */ if ((data ~0x7UL) = TASK_SIZE) return -EIO; - /* Ensure translation is on */ +#ifdef CONFIG_PPC64 + + /* For processors using DABR (i.e. 970), the bottom 3 bits are flags. + It was assumed, on previous implementations, that 3 bits were + passed together with the data address, fitting the design of the + DABR register, as follows: + + bit 0: Read flag + bit 1: Write flag + bit 2: Breakpoint translation + + Thus, we use them here as so. */ + + /* Ensure breakpoint translation bit is set. */ if (data !(data DABR_TRANSLATION)) return -EIO; + /* Move contents to the DABR register. */ task-thread.dabr = data; + +#endif +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + + /* Read or Write bits must be set. */ + if (!(data 0x3UL)) +
Re: PPC upstream kernel ignored DABR bug
On Wed, 2008-03-12 at 23:30 +0100, Jens Osterkamp wrote: Just to make sure, i tested the binary against the 2.6.25-rc4 kernel. It still fails. So this is really an open bug for PPC. On a Cell- or 970-based machine ? Gruß, Jens On a 970-based machine. Regards, -- Luis Machado Software Engineer IBM Linux Technology Center ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: PPC upstream kernel ignored DABR bug
Hi, On the Blade DABRX had to be set additional to DABR. PS3 and Celleb already did this. Uli Weigand found this back in November. I submitted a patch for this which went into 2.6.25-rc4. Can you please try again with rc4 ? Gruß, Jens Just to make sure, i tested the binary against the 2.6.25-rc4 kernel. It still fails. So this is really an open bug for PPC. -- Luis Machado Software Engineer IBM Linux Technology Center ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: PPC upstream kernel ignored DABR bug
On the Blade DABRX had to be set additional to DABR. PS3 and Celleb already did this. Uli Weigand found this back in November. I submitted a patch for this which went into 2.6.25-rc4. Can you please try again with rc4 ? I will try it and will post the results back. Thanks Jens. Regards, -- Luis Machado Software Engineer IBM Linux Technology Center ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: PPC upstream kernel ignored DABR bug
On Mon, 2008-03-10 at 12:19 -0700, Roland McGrath wrote: On the Blade DABRX had to be set additional to DABR. PS3 and Celleb already did this. Uli Weigand found this back in November. I submitted a patch for this which went into 2.6.25-rc4. Can you please try again with rc4 ? This is not the problem. This came up before and everyone seems have forgotten. This bug has been reproduced on G5's, which do not have DABRX as I understand it. Yes, now that you mentioned, i've been able to reproduce this on 970FX's blades, which i don't think have DABRX registers. I guess it's the almost the same CPU as G5's. Regards, -- Luis Machado Software Engineer IBM Linux Technology Center ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: PPC upstream kernel ignored DABR bug
Yes, I know. I tried it on the PS3 first and couldn't reproduce the bug he saw on the blade. Arnd, Do we have any news on this topic? I've seen this happening quite often within GDB when using hardware watchpoints on a shared variable in a threaded (7+ threads) binary. Sometimes the watchpoint won't trigger, even though the monitored variable's value was modified. Appreciate your feedback. Best regards, -- Luis Machado LoP Toolchain Software Engineer IBM Linux Technology Center ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev