Re: [RFC] sched/eevdf: sched feature to dismiss lag on wakeup

2024-03-20 Thread Luis Machado
On 3/20/24 07:04, Tobias Huschle wrote:
> On Tue, Mar 19, 2024 at 02:41:14PM +0100, Vincent Guittot wrote:
>> On Tue, 19 Mar 2024 at 10:08, Tobias Huschle  wrote:
>>>
>>> On 2024-03-18 15:45, Luis Machado wrote:
>>>> On 3/14/24 13:45, Tobias Huschle wrote:
>>>>> On Fri, Mar 08, 2024 at 03:11:38PM +, Luis Machado wrote:
>>>>>> On 2/28/24 16:10, Tobias Huschle wrote:
>>>>>>>
>>>>>>> Questions:
>>>>>>> 1. The kworker getting its negative lag occurs in the following
>>>>>>> scenario
>>>>>>>- kworker and a cgroup are supposed to execute on the same CPU
>>>>>>>- one task within the cgroup is executing and wakes up the
>>>>>>> kworker
>>>>>>>- kworker with 0 lag, gets picked immediately and finishes its
>>>>>>>  execution within ~5000ns
>>>>>>>- on dequeue, kworker gets assigned a negative lag
>>>>>>>Is this expected behavior? With this short execution time, I
>>>>>>> would
>>>>>>>expect the kworker to be fine.
>>>>>>
>>>>>> That strikes me as a bit odd as well. Have you been able to determine
>>>>>> how a negative lag
>>>>>> is assigned to the kworker after such a short runtime?
>>>>>>
>>>>>
>>>>> I did some more trace reading though and found something.
>>>>>
>>>>> What I observed if everything runs regularly:
>>>>> - vhost and kworker run alternating on the same CPU
>>>>> - if the kworker is done, it leaves the runqueue
>>>>> - vhost wakes up the kworker if it needs it
>>>>> --> this means:
>>>>>   - vhost starts alone on an otherwise empty runqueue
>>>>>   - it seems like it never gets dequeued
>>>>> (unless another unrelated task joins or migration hits)
>>>>>   - if vhost wakes up the kworker, the kworker gets selected
>>>>>   - vhost runtime > kworker runtime
>>>>> --> kworker gets positive lag and gets selected immediately next
>>>>> time
>>>>>
>>>>> What happens if it does go wrong:
>>>>> From what I gather, there seem to be occasions where the vhost either
>>>>> executes suprisingly quick, or the kworker surprinsingly slow. If
>>>>> these
>>>>> outliers reach critical values, it can happen, that
>>>>>vhost runtime < kworker runtime
>>>>> which now causes the kworker to get the negative lag.
>>>>>
>>>>> In this case it seems like, that the vhost is very fast in waking up
>>>>> the kworker. And coincidentally, the kworker takes, more time than
>>>>> usual
>>>>> to finish. We speak of 4-digit to low 5-digit nanoseconds.
>>>>>
>>>>> So, for these outliers, the scheduler extrapolates that the kworker
>>>>> out-consumes the vhost and should be slowed down, although in the
>>>>> majority
>>>>> of other cases this does not happen.
>>>>
>>>> Thanks for providing the above details Tobias. It does seem like EEVDF
>>>> is strict
>>>> about the eligibility checks and making tasks wait when their lags are
>>>> negative, even
>>>> if just a little bit as in the case of the kworker.
>>>>
>>>> There was a patch to disable the eligibility checks
>>>> (https://lore.kernel.org/lkml/20231013030213.2472697-1-youssefes...@chromium.org/),
>>>> which would make EEVDF more like EVDF, though the deadline comparison
>>>> would
>>>> probably still favor the vhost task instead of the kworker with the
>>>> negative lag.
>>>>
>>>> I'm not sure if you tried it, but I thought I'd mention it.
>>>
>>> Haven't seen that one yet. Unfortunately, it does not help to ignore the
>>> eligibility.
>>>
>>> I'm inclined to rather propose propose a documentation change, which
>>> describes that tasks should not rely on woken up tasks being scheduled
>>> immediately.
>>
>> Where do you see such an assumption ? Even before eevdf, there were
>> nothing that ensures such behavior. When using CFS (legacy or eevdf)
>> tasks, you can't know if the newly wakeup task will run 1st or not
>>
> 
> There was no guarantee of course. place_entity was reducing the vruntime of 
> woken up tasks though, giving it a slight boost, right?. For the scenario 
> that I observed, that boost was enough to make sure, that the woken up tasks 
> gets scheduled consistently. This might still not be true for all scenarios, 
> but in general EEVDF seems to be stricter with woken up tasks.

It seems that way, as EEVDF will do eligibility and deadline checks before 
scheduling a task, so
a task would have to satisfy both of those checks.

I think we have some special treatment for when a task initially joins the 
competition, in which
case we halve its slice. But I don't think there is any special treatment for 
woken tasks
anymore.

There was also a fix (63304558ba5dcaaff9e052ee43cfdcc7f9c29e85) to try to 
reduce the number of
wake up preemptions under some conditions, under the RUN_TO_PARITY feature.


Re: [RFC] sched/eevdf: sched feature to dismiss lag on wakeup

2024-03-18 Thread Luis Machado
On 3/14/24 13:45, Tobias Huschle wrote:
> On Fri, Mar 08, 2024 at 03:11:38PM +0000, Luis Machado wrote:
>> On 2/28/24 16:10, Tobias Huschle wrote:
>>>
>>> Questions:
>>> 1. The kworker getting its negative lag occurs in the following scenario
>>>- kworker and a cgroup are supposed to execute on the same CPU
>>>- one task within the cgroup is executing and wakes up the kworker
>>>- kworker with 0 lag, gets picked immediately and finishes its
>>>  execution within ~5000ns
>>>- on dequeue, kworker gets assigned a negative lag
>>>Is this expected behavior? With this short execution time, I would
>>>expect the kworker to be fine.
>>
>> That strikes me as a bit odd as well. Have you been able to determine how a 
>> negative lag
>> is assigned to the kworker after such a short runtime?
>>
> 
> I did some more trace reading though and found something.
> 
> What I observed if everything runs regularly:
> - vhost and kworker run alternating on the same CPU
> - if the kworker is done, it leaves the runqueue
> - vhost wakes up the kworker if it needs it
> --> this means:
>   - vhost starts alone on an otherwise empty runqueue
>   - it seems like it never gets dequeued
> (unless another unrelated task joins or migration hits)
>   - if vhost wakes up the kworker, the kworker gets selected
>   - vhost runtime > kworker runtime 
> --> kworker gets positive lag and gets selected immediately next time
> 
> What happens if it does go wrong:
> From what I gather, there seem to be occasions where the vhost either
> executes suprisingly quick, or the kworker surprinsingly slow. If these
> outliers reach critical values, it can happen, that
>vhost runtime < kworker runtime
> which now causes the kworker to get the negative lag.
> 
> In this case it seems like, that the vhost is very fast in waking up
> the kworker. And coincidentally, the kworker takes, more time than usual
> to finish. We speak of 4-digit to low 5-digit nanoseconds.
> 
> So, for these outliers, the scheduler extrapolates that the kworker 
> out-consumes the vhost and should be slowed down, although in the majority
> of other cases this does not happen.

Thanks for providing the above details Tobias. It does seem like EEVDF is strict
about the eligibility checks and making tasks wait when their lags are 
negative, even
if just a little bit as in the case of the kworker.

There was a patch to disable the eligibility checks 
(https://lore.kernel.org/lkml/20231013030213.2472697-1-youssefes...@chromium.org/),
which would make EEVDF more like EVDF, though the deadline comparison would
probably still favor the vhost task instead of the kworker with the negative 
lag.

I'm not sure if you tried it, but I thought I'd mention it.


Re: [RFC] sched/eevdf: sched feature to dismiss lag on wakeup

2024-03-08 Thread Luis Machado
Hi Tobias,

On 2/28/24 16:10, Tobias Huschle wrote:
> The previously used CFS scheduler gave tasks that were woken up an
> enhanced chance to see runtime immediately by deducting a certain value
> from its vruntime on runqueue placement during wakeup.
> 
> This property was used by some, at least vhost, to ensure, that certain
> kworkers are scheduled immediately after being woken up. The EEVDF
> scheduler, does not support this so far. Instead, if such a woken up
> entitiy carries a negative lag from its previous execution, it will have
> to wait for the current time slice to finish, which affects the
> performance of the process expecting the immediate execution negatively.
> 
> To address this issue, implement EEVDF strategy #2 for rejoining
> entities, which dismisses the lag from previous execution and allows
> the woken up task to run immediately (if no other entities are deemed
> to be preferred for scheduling by EEVDF).
> 
> The vruntime is decremented by an additional value of 1 to make sure,
> that the woken up tasks gets to actually run. This is of course not
> following strategy #2 in an exact manner but guarantees the expected
> behavior for the scenario described above. Without the additional
> decrement, the performance goes south even more. So there are some
> side effects I could not get my head around yet.
> 
> Questions:
> 1. The kworker getting its negative lag occurs in the following scenario
>- kworker and a cgroup are supposed to execute on the same CPU
>- one task within the cgroup is executing and wakes up the kworker
>- kworker with 0 lag, gets picked immediately and finishes its
>  execution within ~5000ns
>- on dequeue, kworker gets assigned a negative lag
>Is this expected behavior? With this short execution time, I would
>expect the kworker to be fine.

That strikes me as a bit odd as well. Have you been able to determine how a 
negative lag
is assigned to the kworker after such a short runtime?

I was looking at a different thread 
(https://lore.kernel.org/lkml/20240226082349.302363-1-yu.c.c...@intel.com/) that
uncovers a potential overflow in the eligibility calculation. Though I don't 
think that is the case for this particular
vhost problem.


Re: [Patch 0/6] [Patch 0/6] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver VIII

2009-08-03 Thread Luis Machado
Hi,

On Fri, 2009-07-31 at 16:10 +1000, David Gibson wrote:
 On Mon, Jul 27, 2009 at 05:41:52AM +0530, K.Prasad wrote:
  Hi David,
  I'm back with a new version of patches after a brief hiatus!
  
  After much deliberation about modifying the code to change the timing of 
  signal
  delivery to user-space, it has been decided to retain the existing behaviour
  i.e. SIGTRAP delivered to user-space after execution of causative 
  instruction
  although exception is raised before execution of it.
 
 Ok.  Except, presumably for ptrace, since changing that would break
 gdb.

Yes. GDB works with some assumptions in mind. Hardware breakpoints
(they're not supported right now) trigger before execution, then GDB
just removes them and stepi's until we're past the breakpoint.

As for HW watchpoints, we always stop before execution (by ppc design
presumably). GDB will check the value before the trigger, remove the HW
watch, stepi it, and will check the value again to see if it has
changed.

If we're to trigger after the data load/store has happened, we could
have breakage in GDB.

Regards,
Luis

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] powerpc: clean up the Book-E HW watchpoint support

2008-07-25 Thread Luis Machado
Works for me.

I presume you had positive results on the Book-E as well.

By the way, thanks for cleaning it up.

Luis

On Fri, 2008-07-25 at 14:27 -0500, Kumar Gala wrote:
 * CONFIG_BOOKE is selected by CONFIG_44x so we dont need both
 * Fixed a few comments
 * Go back to only using DBCR0_IDM to determine if we are using
   debug resources.
 
 Signed-off-by: Kumar Gala [EMAIL PROTECTED]
 ---
 
 Luis, can you test this on 44x.  I don't expect any breakage.
 
 - k
 
  arch/powerpc/kernel/entry_32.S |6 +++---
  arch/powerpc/kernel/process.c  |8 
  arch/powerpc/kernel/ptrace.c   |7 ---
  arch/powerpc/kernel/signal.c   |2 +-
  4 files changed, 12 insertions(+), 11 deletions(-)
 
 diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
 index 81c8324..da52269 100644
 --- a/arch/powerpc/kernel/entry_32.S
 +++ b/arch/powerpc/kernel/entry_32.S
 @@ -148,7 +148,7 @@ transfer_to_handler:
   /* Check to see if the dbcr0 register is set up to debug.  Use the
  internal debug mode bit to do this. */
   lwz r12,THREAD_DBCR0(r12)
 - andis.  r12,r12,(DBCR0_IDM  | DBSR_DAC1R | DBSR_DAC1W)@h
 + andis.  r12,r12,[EMAIL PROTECTED]
   beq+3f
   /* From user and task is ptraced - load up global dbcr0 */
   li  r12,-1  /* clear all pending debug events */
 @@ -292,7 +292,7 @@ syscall_exit_cont:
   /* If the process has its own DBCR0 value, load it up.  The internal
  debug mode bit tells us that dbcr0 should be loaded. */
   lwz r0,THREAD+THREAD_DBCR0(r2)
 - andis.  r10,r0,(DBCR0_IDM  | DBSR_DAC1R | DBSR_DAC1W)@h
 + andis.  r10,r0,[EMAIL PROTECTED]
   bnel-   load_dbcr0
  #endif
  #ifdef CONFIG_44x
 @@ -720,7 +720,7 @@ restore_user:
   /* Check whether this process has its own DBCR0 value.  The internal
  debug mode bit tells us that dbcr0 should be loaded. */
   lwz r0,THREAD+THREAD_DBCR0(r2)
 - andis.  r10,r0,(DBCR0_IDM  | DBSR_DAC1R | DBSR_DAC1W)@h
 + andis.  r10,r0,[EMAIL PROTECTED]
   bnel-   load_dbcr0
  #endif
 
 diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
 index db2497c..e030f3b 100644
 --- a/arch/powerpc/kernel/process.c
 +++ b/arch/powerpc/kernel/process.c
 @@ -254,7 +254,7 @@ void do_dabr(struct pt_regs *regs, unsigned long address,
   return;
 
   /* Clear the DAC and struct entries.  One shot trigger */
 -#if (defined(CONFIG_44x) || defined(CONFIG_BOOKE))
 +#if defined(CONFIG_BOOKE)
   mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0)  ~(DBSR_DAC1R | DBSR_DAC1W
   | DBCR0_IDM));
  #endif
 @@ -286,7 +286,7 @@ int set_dabr(unsigned long dabr)
   mtspr(SPRN_DABR, dabr);
  #endif
 
 -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
 +#if defined(CONFIG_BOOKE)
   mtspr(SPRN_DAC1, dabr);
  #endif
 
 @@ -373,7 +373,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
   if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr))
   set_dabr(new-thread.dabr);
 
 -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
 +#if defined(CONFIG_BOOKE)
   /* If new thread DAC (HW breakpoint) is the same then leave it */
   if (new-thread.dabr)
   set_dabr(new-thread.dabr);
 @@ -568,7 +568,7 @@ void flush_thread(void)
   current-thread.dabr = 0;
   set_dabr(0);
 
 -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
 +#if defined(CONFIG_BOOKE)
   current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W);
  #endif
   }
 diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
 index a5d0e78..66204cb 100644
 --- a/arch/powerpc/kernel/ptrace.c
 +++ b/arch/powerpc/kernel/ptrace.c
 @@ -717,7 +717,7 @@ void user_disable_single_step(struct task_struct *task)
   struct pt_regs *regs = task-thread.regs;
 
 
 -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
 +#if defined(CONFIG_BOOKE)
   /* If DAC then do not single step, skip */
   if (task-thread.dabr)
   return;
 @@ -744,10 +744,11 @@ int ptrace_set_debugreg(struct task_struct *task, 
 unsigned long addr,
   if (addr  0)
   return -EINVAL;
 
 + /* The bottom 3 bits in dabr are flags */
   if ((data  ~0x7UL) = TASK_SIZE)
   return -EIO;
 
 -#ifdef CONFIG_PPC64
 +#ifndef CONFIG_BOOKE
 
   /* For processors using DABR (i.e. 970), the bottom 3 bits are flags.
*  It was assumed, on previous implementations, that 3 bits were
 @@ -769,7 +770,7 @@ int ptrace_set_debugreg(struct task_struct *task, 
 unsigned long addr,
   task-thread.dabr = data;
 
  #endif
 -#if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
 +#if defined(CONFIG_BOOKE)
 
   /* As described above, it was assumed 3 bits were passed with the data
*  address, but we will assume only the mode bits will be passed
 diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
 index 

Re: [RFC] 4xx hardware watchpoint support

2008-07-23 Thread Luis Machado
 Some comment, first the above negate conditional
 looks rather ugly, I'd rather do a
 
 #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
   dbcr0 case
 #else
   dabr case
 #endif

Yes, it makes sense. I'll switch it around.

 second I wonder why we have the notify_die only for one case, that seems
 rather odd.  Looking further the notify_die is even more odd because
 DIE_DABR_MATCH doesn't actually appear anywhere else in the kernel.
 I'd suggest simply removing it.

DIE_DABR_MATCH doesn't appear anywhere else because there is only a
single function responsible for handling the DABR/DAC events on powerPC
with this modification. It would make sense to call this to both the
DAC/DABR cases though (i.e. taking it out of the #ifdef), what do you
think?

 Can you redo this in the normal Linux comment style, ala:
 
   /*
* For processors using DABR (i.e. 970), the bottom 3 bits are flags.
*  It was assumed, on previous implementations, that 3 bits were
*  passed together with the data address, fitting the design of the
*  DABR register, as follows:
*
*  bit 0: Read flag
*  bit 1: Write flag
*  bit 2: Breakpoint translation
*
*  Thus, we use them here as so.
*/
 
 and similar in few other places.

Will do, thanks for reviewing this one.

Regards,
Luis

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC] 4xx hardware watchpoint support

2008-07-23 Thread Luis Machado
On Wed, 2008-07-23 at 11:53 -0400, Josh Boyer wrote:
 Shouldn't this (and other places) be:
 
 #if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
 
 if you are going to exclude 40x for now?  Otherwise this is still
 enabled on 405 and setting the wrong register.
 
 josh

Yes, sorry. I wasn't aware of this specific define value. It makes
things easier to support 405's later.

Like so?

This addresses Christoph's comments as well.


Signed-off-by: Luis Machado [EMAIL PROTECTED]

Index: linux-2.6.26/arch/powerpc/kernel/entry_32.S
===
--- linux-2.6.26.orig/arch/powerpc/kernel/entry_32.S2008-07-23 
07:44:57.0 -0700
+++ linux-2.6.26/arch/powerpc/kernel/entry_32.S 2008-07-23 07:50:31.0 
-0700
@@ -148,7 +148,7 @@
/* Check to see if the dbcr0 register is set up to debug.  Use the
   internal debug mode bit to do this. */
lwz r12,THREAD_DBCR0(r12)
-   andis.  r12,r12,[EMAIL PROTECTED]
+   andis.  r12,r12,(DBCR0_IDM  | DBSR_DAC1R | DBSR_DAC1W)@h
beq+3f
/* From user and task is ptraced - load up global dbcr0 */
li  r12,-1  /* clear all pending debug events */
@@ -292,7 +292,7 @@
/* If the process has its own DBCR0 value, load it up.  The internal
   debug mode bit tells us that dbcr0 should be loaded. */
lwz r0,THREAD+THREAD_DBCR0(r2)
-   andis.  r10,r0,[EMAIL PROTECTED]
+   andis.  r10,r0,(DBCR0_IDM  | DBSR_DAC1R | DBSR_DAC1W)@h
bnel-   load_dbcr0
 #endif
 #ifdef CONFIG_44x
@@ -720,7 +720,7 @@
/* Check whether this process has its own DBCR0 value.  The internal
   debug mode bit tells us that dbcr0 should be loaded. */
lwz r0,THREAD+THREAD_DBCR0(r2)
-   andis.  r10,r0,[EMAIL PROTECTED]
+   andis.  r10,r0,(DBCR0_IDM  | DBSR_DAC1R | DBSR_DAC1W)@h
bnel-   load_dbcr0
 #endif
 
Index: linux-2.6.26/arch/powerpc/kernel/process.c
===
--- linux-2.6.26.orig/arch/powerpc/kernel/process.c 2008-07-23 
07:44:57.0 -0700
+++ linux-2.6.26/arch/powerpc/kernel/process.c  2008-07-23 07:50:31.0 
-0700
@@ -47,6 +47,8 @@
 #ifdef CONFIG_PPC64
 #include asm/firmware.h
 #endif
+#include linux/kprobes.h
+#include linux/kdebug.h
 
 extern unsigned long _get_SP(void);
 
@@ -239,6 +241,35 @@
 }
 #endif /* CONFIG_SMP */
 
+void do_dabr(struct pt_regs *regs, unsigned long address,
+   unsigned long error_code)
+{
+   siginfo_t info;
+
+   if (notify_die(DIE_DABR_MATCH, dabr_match, regs, error_code,
+   11, SIGSEGV) == NOTIFY_STOP)
+   return;
+
+   if (debugger_dabr_match(regs))
+   return;
+
+   /* Clear the DAC and struct entries.  One shot trigger */
+#if (defined(CONFIG_44x) || defined(CONFIG_BOOKE))
+   mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0)  ~(DBSR_DAC1R | DBSR_DAC1W
+   | DBCR0_IDM));
+#endif
+
+   /* Clear the DABR */
+   set_dabr(0);
+
+   /* Deliver the signal to userspace */
+   info.si_signo = SIGTRAP;
+   info.si_errno = 0;
+   info.si_code = TRAP_HWBKPT;
+   info.si_addr = (void __user *)address;
+   force_sig_info(SIGTRAP, info, current);
+}
+
 static DEFINE_PER_CPU(unsigned long, current_dabr);
 
 int set_dabr(unsigned long dabr)
@@ -254,6 +285,11 @@
 #if defined(CONFIG_PPC64) || defined(CONFIG_6xx)
mtspr(SPRN_DABR, dabr);
 #endif
+
+#if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
+   mtspr(SPRN_DAC1, dabr);
+#endif
+
return 0;
 }
 
@@ -337,6 +373,12 @@
if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr))
set_dabr(new-thread.dabr);
 
+#if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
+   /* If new thread DAC (HW breakpoint) is the same then leave it */
+   if (new-thread.dabr)
+   set_dabr(new-thread.dabr);
+#endif
+
new_thread = new-thread;
old_thread = current-thread;
 
@@ -525,6 +567,10 @@
if (current-thread.dabr) {
current-thread.dabr = 0;
set_dabr(0);
+
+#if defined(CONFIG_44x) || defined(CONFIG_BOOKE)
+   current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W);
+#endif
}
 }
 
Index: linux-2.6.26/arch/powerpc/kernel/ptrace.c
===
--- linux-2.6.26.orig/arch/powerpc/kernel/ptrace.c  2008-07-23 
07:44:57.0 -0700
+++ linux-2.6.26/arch/powerpc/kernel/ptrace.c   2008-07-23 07:53:45.0 
-0700
@@ -703,7 +703,7 @@
 
if (regs != NULL) {
 #if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
-   task-thread.dbcr0 = DBCR0_IDM | DBCR0_IC;
+   task-thread.dbcr0 |= DBCR0_IDM | DBCR0_IC;
regs-msr |= MSR_DE;
 #else
regs-msr |= MSR_SE;
@@ -716,9 +716,16 @@
 {
struct

Re: [RFC] 4xx hardware watchpoint support

2008-07-22 Thread Luis Machado
Hi,

 That, or adding a small function to move the bits to the appropriate
 registers (set_dbcr or set_dac_events).
 
  Do you think it's worth to support this facility on 405's processors? If
  so, i'll gladly work on a solution to it.
 
 I would think so.  There's really no difference from a userspace
 perspective, so gdb watchpoints could be valuable there too.  I'll
 leave it up to you though.

As the 440 support is ready and the 405 needs additional tweaking due to
the use of DBCR1 instead of DBCR0 and due to a different position scheme
of the DAC1R/DAC1W flags inside DBCR1, i'd say we should include this
code and handle the 405 case later. 

We might have to handle it anyway if we're going to pursue the hardware
breakpoint interface work in the future.

I've fixed some formatting problems. Tested on a 440 Taishan board and
on a PPC970. Both worked as they should. Ok?


Signed-off-by: Luis Machado [EMAIL PROTECTED]

Index: linux-2.6.26/arch/powerpc/kernel/process.c
===
--- linux-2.6.26.orig/arch/powerpc/kernel/process.c 2008-07-20 
16:56:57.0 -0700
+++ linux-2.6.26/arch/powerpc/kernel/process.c  2008-07-22 16:46:36.0 
-0700
@@ -47,6 +47,8 @@
 #ifdef CONFIG_PPC64
 #include asm/firmware.h
 #endif
+#include linux/kprobes.h
+#include linux/kdebug.h
 
 extern unsigned long _get_SP(void);
 
@@ -239,6 +241,36 @@
 }
 #endif /* CONFIG_SMP */
 
+void do_dabr(struct pt_regs *regs, unsigned long address,
+   unsigned long error_code)
+{
+   siginfo_t info;
+
+#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
+   if (notify_die(DIE_DABR_MATCH, dabr_match, regs, error_code,
+   11, SIGSEGV) == NOTIFY_STOP)
+   return;
+
+   if (debugger_dabr_match(regs))
+   return;
+
+   /* Clear the DAC and struct entries.  One shot trigger.  */
+#else /* (defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) */
+   mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0)  ~(DBSR_DAC1R | DBSR_DAC1W
+   | DBCR0_IDM));
+#endif
+
+   /* Clear the DABR */
+   set_dabr(0);
+
+   /* Deliver the signal to userspace */
+   info.si_signo = SIGTRAP;
+   info.si_errno = 0;
+   info.si_code = TRAP_HWBKPT;
+   info.si_addr = (void __user *)address;
+   force_sig_info(SIGTRAP, info, current);
+}
+
 static DEFINE_PER_CPU(unsigned long, current_dabr);
 
 int set_dabr(unsigned long dabr)
@@ -254,6 +286,11 @@
 #if defined(CONFIG_PPC64) || defined(CONFIG_6xx)
mtspr(SPRN_DABR, dabr);
 #endif
+
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   mtspr(SPRN_DAC1, dabr);
+#endif
+
return 0;
 }
 
@@ -337,6 +374,12 @@
if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr))
set_dabr(new-thread.dabr);
 
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   /* If new thread DAC (HW breakpoint) is the same then leave it.  */
+   if (new-thread.dabr)
+   set_dabr(new-thread.dabr);
+#endif
+
new_thread = new-thread;
old_thread = current-thread;
 
@@ -525,6 +568,10 @@
if (current-thread.dabr) {
current-thread.dabr = 0;
set_dabr(0);
+
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W);
+#endif
}
 }
 
Index: linux-2.6.26/arch/powerpc/kernel/ptrace.c
===
--- linux-2.6.26.orig/arch/powerpc/kernel/ptrace.c  2008-07-20 
16:56:57.0 -0700
+++ linux-2.6.26/arch/powerpc/kernel/ptrace.c   2008-07-22 16:41:24.0 
-0700
@@ -703,7 +703,7 @@
 
if (regs != NULL) {
 #if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
-   task-thread.dbcr0 = DBCR0_IDM | DBCR0_IC;
+   task-thread.dbcr0 |= DBCR0_IDM | DBCR0_IC;
regs-msr |= MSR_DE;
 #else
regs-msr |= MSR_SE;
@@ -716,9 +716,16 @@
 {
struct pt_regs *regs = task-thread.regs;
 
+
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   /* If DAC then do not single step, skip.  */
+   if (task-thread.dabr)
+   return;
+#endif
+
if (regs != NULL) {
 #if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
-   task-thread.dbcr0 = 0;
+   task-thread.dbcr0 = ~(DBCR0_IC | DBCR0_IDM);
regs-msr = ~MSR_DE;
 #else
regs-msr = ~MSR_SE;
@@ -727,22 +734,71 @@
clear_tsk_thread_flag(task, TIF_SINGLESTEP);
 }
 
-static int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
+int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
   unsigned long data)
 {
-   /* We only support one DABR and no IABRS at the moment */
+   /* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
+  For embedded processors we support one DAC

Re: [RFC] 4xx hardware watchpoint support

2008-07-21 Thread Luis Machado

 This doesn't look right for how it's coded.  This would be the
 CONFIG_4xx || CONFIG_BOOKE case, but CONFIG_4xx includes PowerPC 405.
 That has a different bit layout among the DBCR registers.  Namely, on
 405 you would be clearing the TDE and IAC1 events because the DAC
 events are in DBCR1, not DBCR0.

Maybe guarding the 405-specific parts in a separate #if
defined(CONFIG_40x) block will do?

Do you think it's worth to support this facility on 405's processors? If
so, i'll gladly work on a solution to it.

Regards,
Luis

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC] 4xx hardware watchpoint support

2008-06-30 Thread Luis Machado
Hi guys,

Did anyone have a chance to go over this patch? Looking forward to
receive feedbacks on it.

Thanks!

Regards,
Luis

On Fri, 2008-06-20 at 17:14 -0300, Luis Machado wrote:
 On Thu, 2008-05-22 at 13:51 +1000, Paul Mackerras wrote:
  Luis Machado writes:
  
   This is a patch that has been sitting idle for quite some time. I
   decided to move it further because it is something useful. It was
   originally written by Michel Darneille, based off of 2.6.16.
   
   The original patch, though, was not compatible with the current DABR
   logic. DABR's are used to implement hardware watchpoint support for
   ppc64 processors (i.e. 970, Power5 etc). 4xx's have a different
   debugging register layout and needs to be handled differently (they two
   registers: DAC and DBCR0).
  
  Yes, they are different, but they do essentially the same thing, so I
  think we should try and unify the handling of the two.  Maybe you
  could rename set_dabr() to set_hw_watchpoint() or something and make
  it set DABR on 64-bit and classic 32-bit processors, and DAC on
  4xx/Book E processors.
  
  Likewise, I don't think we need both a dabr field and a dac field
  in the thread_struct - one should do.  Rename dabr to something else
  if you feel that the dabr name is too ppc64-specific.  And I don't
  think we need both __debugger_dabr_match and __debugger_dac_match.
  
   5) This is something i'm worried about for future features. We currently
   have a way to support only Hardware Watchpoints, but not Hardware
   Breakpoints (on 64-bit processors that have a IABR register or 32-bit
   processors carrying the IAC register). Looking at the code, we don't
   differentiate a watchpoint from a breakpoint request. A ptrace call has
   currently 3 arguments: REQUEST, ADDR and DATA. We use REQUEST and DATA
   to set a hardware watchpoint. Maybe we could use the ADDR parameter to
   set a hardware breakpoint? Or use it to tell the kernel whether we want
   a hardware watchpoint or hardware breakpoint and then pass the address
   of the instruction/data through the DATA parameter? What do you think?
  
  I would think there would be a different REQUEST value to mean set a
  hardware breakpoint.  Roland McGrath (cc'd) might be able to tell us
  what other architectures do.
  
  Paul.
 
 Paul,
 
 Follows a re-worked patch that unifies the handling of hardware
 watchpoint structures for DABR-based and DAC-based processors.
 
 The flow is exactly the same for DABR-based processors.
 
 As for the DAC-based code, i've eliminated the dac field from the
 thread structure, since now we use the dabr field to represent DAC1 of
 the DAC-based processors. As a consequence, i removed all the
 occurrences of dac throughout the code, using dabr in those cases.
 
 The function set_dabr sets the correct register (DABR OR DAC) for each
 specific processor now, instead of only setting the value for DABR-based
 processors.
 
 do_dac was merged with do_dabr as they were mostly equivalent pieces
 of code. I've moved do_dabr to arch/powerpc/kernel/process.c so it
 is visible for DAC-based code as well.
 
 Tested on a Taishan 440GX with success.
 
 Is it OK to leave it as dabr for both DABR and DAC? What do you think
 of the patch overall?
 
 Thanks,
 
 Best regards,
 Luis
 
 
 Index: linux-2.6.25-oldpatch/arch/powerpc/kernel/process.c
 ===
 --- linux-2.6.25-oldpatch.orig/arch/powerpc/kernel/process.c  2008-06-20 
 02:48:19.0 -0700
 +++ linux-2.6.25-oldpatch/arch/powerpc/kernel/process.c   2008-06-20 
 02:51:16.0 -0700
 @@ -241,6 +241,35 @@
  }
  #endif /* CONFIG_SMP */
 
 +void do_dabr(struct pt_regs *regs, unsigned long address,
 + unsigned long error_code)
 +{
 + siginfo_t info;
 +
 +#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
 + if (notify_die(DIE_DABR_MATCH, dabr_match, regs, error_code,
 + 11, SIGSEGV) == NOTIFY_STOP)
 + return;
 +
 + if (debugger_dabr_match(regs))
 + return;
 +#else
 +/* Clear the DAC and struct entries.  One shot trigger.  */
 +mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0)  ~(DBSR_DAC1R | DBSR_DAC1W
 +| DBCR0_IDM));
 +#endif
 +
 + /* Clear the DABR */
 + set_dabr(0);
 +
 + /* Deliver the signal to userspace */
 + info.si_signo = SIGTRAP;
 + info.si_errno = 0;
 + info.si_code = TRAP_HWBKPT;
 + info.si_addr = (void __user *)address;
 + force_sig_info(SIGTRAP, info, current);
 +}
 +
  static DEFINE_PER_CPU(unsigned long, current_dabr);
 
  int set_dabr(unsigned long dabr)
 @@ -256,6 +285,11 @@
  #if defined(CONFIG_PPC64) || defined(CONFIG_6xx)
   mtspr(SPRN_DABR, dabr);
  #endif
 +
 +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
 + mtspr(SPRN_DAC1, dabr);
 +#endif
 +
   return 0;
  }
 
 @@ -330,6 +364,12 @@
   if (unlikely(__get_cpu_var(current_dabr) != new

Re: [RFC] 4xx hardware watchpoint support

2008-06-20 Thread Luis Machado

On Thu, 2008-05-22 at 13:51 +1000, Paul Mackerras wrote:
 Luis Machado writes:
 
  This is a patch that has been sitting idle for quite some time. I
  decided to move it further because it is something useful. It was
  originally written by Michel Darneille, based off of 2.6.16.
  
  The original patch, though, was not compatible with the current DABR
  logic. DABR's are used to implement hardware watchpoint support for
  ppc64 processors (i.e. 970, Power5 etc). 4xx's have a different
  debugging register layout and needs to be handled differently (they two
  registers: DAC and DBCR0).
 
 Yes, they are different, but they do essentially the same thing, so I
 think we should try and unify the handling of the two.  Maybe you
 could rename set_dabr() to set_hw_watchpoint() or something and make
 it set DABR on 64-bit and classic 32-bit processors, and DAC on
 4xx/Book E processors.
 
 Likewise, I don't think we need both a dabr field and a dac field
 in the thread_struct - one should do.  Rename dabr to something else
 if you feel that the dabr name is too ppc64-specific.  And I don't
 think we need both __debugger_dabr_match and __debugger_dac_match.
 
  5) This is something i'm worried about for future features. We currently
  have a way to support only Hardware Watchpoints, but not Hardware
  Breakpoints (on 64-bit processors that have a IABR register or 32-bit
  processors carrying the IAC register). Looking at the code, we don't
  differentiate a watchpoint from a breakpoint request. A ptrace call has
  currently 3 arguments: REQUEST, ADDR and DATA. We use REQUEST and DATA
  to set a hardware watchpoint. Maybe we could use the ADDR parameter to
  set a hardware breakpoint? Or use it to tell the kernel whether we want
  a hardware watchpoint or hardware breakpoint and then pass the address
  of the instruction/data through the DATA parameter? What do you think?
 
 I would think there would be a different REQUEST value to mean set a
 hardware breakpoint.  Roland McGrath (cc'd) might be able to tell us
 what other architectures do.
 
 Paul.

Paul,

Follows a re-worked patch that unifies the handling of hardware
watchpoint structures for DABR-based and DAC-based processors.

The flow is exactly the same for DABR-based processors.

As for the DAC-based code, i've eliminated the dac field from the
thread structure, since now we use the dabr field to represent DAC1 of
the DAC-based processors. As a consequence, i removed all the
occurrences of dac throughout the code, using dabr in those cases.

The function set_dabr sets the correct register (DABR OR DAC) for each
specific processor now, instead of only setting the value for DABR-based
processors.

do_dac was merged with do_dabr as they were mostly equivalent pieces
of code. I've moved do_dabr to arch/powerpc/kernel/process.c so it
is visible for DAC-based code as well.

Tested on a Taishan 440GX with success.

Is it OK to leave it as dabr for both DABR and DAC? What do you think
of the patch overall?

Thanks,

Best regards,
Luis


Index: linux-2.6.25-oldpatch/arch/powerpc/kernel/process.c
===
--- linux-2.6.25-oldpatch.orig/arch/powerpc/kernel/process.c2008-06-20 
02:48:19.0 -0700
+++ linux-2.6.25-oldpatch/arch/powerpc/kernel/process.c 2008-06-20 
02:51:16.0 -0700
@@ -241,6 +241,35 @@
 }
 #endif /* CONFIG_SMP */
 
+void do_dabr(struct pt_regs *regs, unsigned long address,
+   unsigned long error_code)
+{
+   siginfo_t info;
+
+#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
+   if (notify_die(DIE_DABR_MATCH, dabr_match, regs, error_code,
+   11, SIGSEGV) == NOTIFY_STOP)
+   return;
+
+   if (debugger_dabr_match(regs))
+   return;
+#else
+/* Clear the DAC and struct entries.  One shot trigger.  */
+mtspr(SPRN_DBCR0, mfspr(SPRN_DBCR0)  ~(DBSR_DAC1R | DBSR_DAC1W
+| DBCR0_IDM));
+#endif
+
+   /* Clear the DABR */
+   set_dabr(0);
+
+   /* Deliver the signal to userspace */
+   info.si_signo = SIGTRAP;
+   info.si_errno = 0;
+   info.si_code = TRAP_HWBKPT;
+   info.si_addr = (void __user *)address;
+   force_sig_info(SIGTRAP, info, current);
+}
+
 static DEFINE_PER_CPU(unsigned long, current_dabr);
 
 int set_dabr(unsigned long dabr)
@@ -256,6 +285,11 @@
 #if defined(CONFIG_PPC64) || defined(CONFIG_6xx)
mtspr(SPRN_DABR, dabr);
 #endif
+
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   mtspr(SPRN_DAC1, dabr);
+#endif
+
return 0;
 }
 
@@ -330,6 +364,12 @@
if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr))
set_dabr(new-thread.dabr);
 
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   /* If new thread DAC (HW breakpoint) is the same then leave it.  */
+   if (new-thread.dabr)
+   set_dabr(new-thread.dabr);
+#endif

Re: [PATCH 5/6] ibm_newemac: PowerPC 440GX EMAC PHY clock workaround

2008-06-13 Thread Luis Machado
Did this show up on 2.6.26 recently? Or we're expecting it to show up a
bit later?

Thanks,
Luis

On Tue, 2008-04-22 at 10:46 +1000, Benjamin Herrenschmidt wrote:
 From: Valentine Barshak [EMAIL PROTECTED]
 
 The PowerPC 440GX Taishan board fails to reset EMAC3 (reset timeout error)
 if there's no link. Because of that it fails to find PHY chip. The older 
 ibm_emac
 driver had a workaround for that: the EMAC_CLK_INTERNAL/EMAC_CLK_EXTERNAL 
 macros,
 which toggle the Ethernet Clock Select bit in the SDR0_MFR register. This 
 patch
 does the same for ibm,emac-440gx compatible chips. The workaround forces
 clock on -all- EMACs, so we select clock under global emac_phy_map_lock.
 
 BenH: Made that #ifdef CONFIG_PPC_DCR_NATIVE for now as dcri_* stuff doesn't
 exist for MMIO type DCRs like Cell. Some future rework  improvements of the
 DCR infrastructure will make that cleaner but for now, this makes it work.
 
 Signed-off-by: Valentine Barshak [EMAIL PROTECTED]
 Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
 
 ---
  drivers/net/ibm_newemac/core.c |   18 +-
  drivers/net/ibm_newemac/core.h |8 ++--
  2 files changed, 23 insertions(+), 3 deletions(-)
 
 --- linux-work.orig/drivers/net/ibm_newemac/core.c2008-04-22 
 10:11:59.0 +1000
 +++ linux-work/drivers/net/ibm_newemac/core.c 2008-04-22 10:31:18.0 
 +1000
 @@ -43,6 +43,8 @@
  #include asm/io.h
  #include asm/dma.h
  #include asm/uaccess.h
 +#include asm/dcr.h
 +#include asm/dcr-regs.h
 
  #include core.h
 
 @@ -2333,6 +2335,11 @@ static int __devinit emac_init_phy(struc
   dev-phy.mdio_read = emac_mdio_read;
   dev-phy.mdio_write = emac_mdio_write;
 
 + /* Enable internal clock source */
 +#ifdef CONFIG_PPC_DCR_NATIVE
 + if (emac_has_feature(dev, EMAC_FTR_440GX_PHY_CLK_FIX))
 + dcri_clrset(SDR0, SDR0_MFR, 0, SDR0_MFR_ECS);
 +#endif
   /* Configure EMAC with defaults so we can at least use MDIO
* This is needed mostly for 440GX
*/
 @@ -2365,6 +2372,12 @@ static int __devinit emac_init_phy(struc
   if (!emac_mii_phy_probe(dev-phy, i))
   break;
   }
 +
 + /* Enable external clock source */
 +#ifdef CONFIG_PPC_DCR_NATIVE
 + if (emac_has_feature(dev, EMAC_FTR_440GX_PHY_CLK_FIX))
 + dcri_clrset(SDR0, SDR0_MFR, SDR0_MFR_ECS, 0);
 +#endif
   mutex_unlock(emac_phy_map_lock);
   if (i == 0x20) {
   printk(KERN_WARNING %s: can't find PHY!\n, np-full_name);
 @@ -2490,8 +2503,11 @@ static int __devinit emac_init_config(st
   }
 
   /* Check EMAC version */
 - if (of_device_is_compatible(np, ibm,emac4))
 + if (of_device_is_compatible(np, ibm,emac4)) {
   dev-features |= EMAC_FTR_EMAC4;
 + if (of_device_is_compatible(np, ibm,emac-440gx))
 + dev-features |= EMAC_FTR_440GX_PHY_CLK_FIX;
 + }
 
   /* Fixup some feature bits based on the device tree */
   if (of_get_property(np, has-inverted-stacr-oc, NULL))
 Index: linux-work/drivers/net/ibm_newemac/core.h
 ===
 --- linux-work.orig/drivers/net/ibm_newemac/core.h2008-04-22 
 10:06:31.0 +1000
 +++ linux-work/drivers/net/ibm_newemac/core.h 2008-04-22 10:30:03.0 
 +1000
 @@ -301,6 +301,10 @@ struct emac_instance {
   * Set if we have new type STACR with STAOPC
   */
  #define EMAC_FTR_HAS_NEW_STACR   0x0040
 +/*
 + * Set if we need phy clock workaround for 440gx
 + */
 +#define EMAC_FTR_440GX_PHY_CLK_FIX   0x0080
 
 
  /* Right now, we don't quite handle the always/possible masks on the
 @@ -312,8 +316,8 @@ enum {
 
   EMAC_FTRS_POSSIBLE  =
  #ifdef CONFIG_IBM_NEW_EMAC_EMAC4
 - EMAC_FTR_EMAC4  | EMAC_FTR_HAS_NEW_STACR|
 - EMAC_FTR_STACR_OC_INVERT|
 + EMAC_FTR_EMAC4 | EMAC_FTR_HAS_NEW_STACR |
 + EMAC_FTR_STACR_OC_INVERT | EMAC_FTR_440GX_PHY_CLK_FIX |
  #endif
  #ifdef CONFIG_IBM_NEW_EMAC_TAH
   EMAC_FTR_HAS_TAH|
 ___
 Linuxppc-dev mailing list
 Linuxppc-dev@ozlabs.org
 https://ozlabs.org/mailman/listinfo/linuxppc-dev
-- 
Luis Machado
Software Engineer 
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC] 4xx hardware watchpoint support

2008-05-23 Thread Luis Machado
On Thu, 2008-05-22 at 13:51 +1000, Paul Mackerras wrote:
 Luis Machado writes:
 
  This is a patch that has been sitting idle for quite some time. I
  decided to move it further because it is something useful. It was
  originally written by Michel Darneille, based off of 2.6.16.
  
  The original patch, though, was not compatible with the current DABR
  logic. DABR's are used to implement hardware watchpoint support for
  ppc64 processors (i.e. 970, Power5 etc). 4xx's have a different
  debugging register layout and needs to be handled differently (they two
  registers: DAC and DBCR0).
 
 Yes, they are different, but they do essentially the same thing, so I
 think we should try and unify the handling of the two.  Maybe you
 could rename set_dabr() to set_hw_watchpoint() or something and make
 it set DABR on 64-bit and classic 32-bit processors, and DAC on
 4xx/Book E processors.
 
 Likewise, I don't think we need both a dabr field and a dac field
 in the thread_struct - one should do.  Rename dabr to something else
 if you feel that the dabr name is too ppc64-specific.  And I don't
 think we need both __debugger_dabr_match and __debugger_dac_match.
 

Thanks for the feedback Paul. I'll try consolidating those mechanisms
into a single, more general scheme. 

Best regards,
Luis

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC] 4xx hardware watchpoint support

2008-05-23 Thread Luis Machado

On Wed, 2008-05-21 at 23:46 -0700, Roland McGrath wrote:
  I would think there would be a different REQUEST value to mean set a
  hardware breakpoint.  Roland McGrath (cc'd) might be able to tell us
  what other architectures do.
 
 Other architectures don't give a good model to follow.  (If anything,
 they just trivally virtualize their own idiosyncratic hardware.)
 
 What I want to see done for this in the future is reviving and
 finishing the hw_breakpoint work begun by Alan Stern, and porting
 that to each arch's particular hardware features.  On that we'd
 build any new interfaces in abstract machine-independent terms,
 just describing the constraints of what the hardware can do,
 rather than having the user interface involve mimicking hardware
 encodings.  (The existing hardware-idiosyncratic ptrace interfaces
 would tie into hw_breakpoint for backward compatibility.)
 
 

Kumar was just mentioning this post a few messages ago:
http://ozlabs.org/pipermail/linuxppc-dev/2008-May/055745.html

That is a very interesting approach to handle all the differences
between each processor's architecture, and a much cleaner way to set the
facilities we want than the current interface we have. Do you know what
is the status of this work? Did it move any further?

Best Regards,
Luis

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[RFC] 4xx hardware watchpoint support

2008-05-21 Thread Luis Machado
Hi,

This is a patch that has been sitting idle for quite some time. I
decided to move it further because it is something useful. It was
originally written by Michel Darneille, based off of 2.6.16.

The original patch, though, was not compatible with the current DABR
logic. DABR's are used to implement hardware watchpoint support for
ppc64 processors (i.e. 970, Power5 etc). 4xx's have a different
debugging register layout and needs to be handled differently (they two
registers: DAC and DBCR0).

I've refreshed the patch to a recent stable release (2.6.25.1, still
apllies cleanly on 2.6.25.4), made the patch compatible with both 4xx
and ppc64 processor designs, fixed some masks that didn't seem correct
(the ones setting hw watch read/write modes) and refactored some of the
code.

Though, i'm still not happy enough with the patch as i think we could
improve it a bit further. Some points i consider worth of attention:

1) There is a do_dac(...) implementation inside
arch/powerpc/kernel/traps.c. I don't feel this is correct. I see that
the 64-bit counterpart, do_dabr(...), is implemented inside
arch/powerpc/mm/fault.c. Due to do_dac(...) being implemented inside
traps.c, we need to externalize the declaration for get_dac(...) on
include/asm-[powerpc|ppc]/system.h so it's made visible to that scope.
We could use mfspr(...) to get the register's contents directly, but
then i wouldn't make sense to have get_dac(...) in the first place.
Maybe moving the do_dac(...) code to arch/powerpc/mm/fault.c would make
more sense since we seem to have the address already, and won't need to
call get_dac(...) to get it.

2) The change to make set_debugreg(...) and get_debugreg(...)
transparent for both DAC-driven and DABR-driven processors is OK. But
that shouldn't require us to externalize the declaration of
set_debugreg(...) in order to use it in arch/powerpc/kernel/traps.c
right? Maybe this has some relationship with the above point.

3) Maybe it would be better to come up with a way to merge both DABR and
DAC/DBCR0 logic in order to get rid of dozens of processor-specific
#ifdef's? This could be a bit more complex since it would require
re-writing good portions of code.

4) Should i use CONFIG_40x ou CONFIG_4xx instead? Would CONFIG_4xx
automatically include 40x's? I'm mainly targetting 4xx's here, though
40x's should be similar except for 403.

5) This is something i'm worried about for future features. We currently
have a way to support only Hardware Watchpoints, but not Hardware
Breakpoints (on 64-bit processors that have a IABR register or 32-bit
processors carrying the IAC register). Looking at the code, we don't
differentiate a watchpoint from a breakpoint request. A ptrace call has
currently 3 arguments: REQUEST, ADDR and DATA. We use REQUEST and DATA
to set a hardware watchpoint. Maybe we could use the ADDR parameter to
set a hardware breakpoint? Or use it to tell the kernel whether we want
a hardware watchpoint or hardware breakpoint and then pass the address
of the instruction/data through the DATA parameter? What do you think?

I appreciate any comments about these items and the patch itself.

Best regards.
Luis
Index: linux-2.6.25.1/arch/powerpc/kernel/process.c
===
--- linux-2.6.25.1.orig/arch/powerpc/kernel/process.c	2008-05-21 07:25:45.0 -0700
+++ linux-2.6.25.1/arch/powerpc/kernel/process.c	2008-05-21 07:26:55.0 -0700
@@ -219,6 +219,28 @@
 }
 #endif /* CONFIG_SPE */
 
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+
+/* Add support for HW Debug breakpoint.  Use DAC register.  */
+int set_dac(unsigned long dac)
+{
+	mtspr(SPRN_DAC1, dac);
+
+	return 0;
+}
+unsigned int get_dac()
+{
+	return mfspr(SPRN_DAC1);
+}
+int set_dbcr0(unsigned long dbcr)
+{
+	mtspr(SPRN_DBCR0, dbcr);
+
+	return 0;
+}
+
+#endif
+
 #ifndef CONFIG_SMP
 /*
  * If we are doing lazy switching of CPU state (FP, altivec or SPE),
@@ -330,6 +352,13 @@
 	if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr))
 		set_dabr(new-thread.dabr);
 
+
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+	/* If new thread DAC (HW breakpoint) is the same then leave it.  */
+	if (new-thread.dac)
+		set_dac(new-thread.dac);
+#endif
+
 	new_thread = new-thread;
 	old_thread = current-thread;
 
@@ -515,6 +544,16 @@
 
 	discard_lazy_cpu_state();
 
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+	if (current-thread.dac) {
+		current-thread.dac = 0;
+		/* Clear debug control.  */
+		current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W);
+
+		set_dac(0);
+	}
+#endif
+
 	if (current-thread.dabr) {
 		current-thread.dabr = 0;
 		set_dabr(0);
Index: linux-2.6.25.1/arch/powerpc/kernel/ptrace.c
===
--- linux-2.6.25.1.orig/arch/powerpc/kernel/ptrace.c	2008-05-21 07:25:45.0 -0700
+++ linux-2.6.25.1/arch/powerpc/kernel/ptrace.c	2008-05-21 08:23:34.0 -0700
@@ -629,9 +629,15 @@
 {
 	struct pt_regs *regs = 

Re: [RFC] 4xx hardware watchpoint support

2008-05-21 Thread Luis Machado
Thanks for the inlining tip. It should be now. :-)

So, basically we are looking at a cleaner and much better interface to
set such hardware features? That's something that would greatly improve
the communication from, say, GDB to the kernel regarding these
facilities.

Regards,
Luis

On Wed, 2008-05-21 at 16:16 -0500, Kumar Gala wrote:
 Two real quick notes.
 
 Take a look at:
 
 http://ozlabs.org/pipermail/linuxppc-dev/2008-May/055745.html
 
 and can you try and post the patch inline next time.  Hard to provide  
 review comments on it :)
 
 - kIndex: linux-2.6.25.1/arch/powerpc/kernel/process.c

===
--- linux-2.6.25.1.orig/arch/powerpc/kernel/process.c   2008-05-21 
07:25:45.0 -0700
+++ linux-2.6.25.1/arch/powerpc/kernel/process.c2008-05-21 
07:26:55.0 -0700
@@ -219,6 +219,28 @@
 }
 #endif /* CONFIG_SPE */
 
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+
+/* Add support for HW Debug breakpoint.  Use DAC register.  */
+int set_dac(unsigned long dac)
+{
+   mtspr(SPRN_DAC1, dac);
+
+   return 0;
+}
+unsigned int get_dac()
+{
+   return mfspr(SPRN_DAC1);
+}
+int set_dbcr0(unsigned long dbcr)
+{
+   mtspr(SPRN_DBCR0, dbcr);
+
+   return 0;
+}
+
+#endif
+
 #ifndef CONFIG_SMP
 /*
  * If we are doing lazy switching of CPU state (FP, altivec or SPE),
@@ -330,6 +352,13 @@
if (unlikely(__get_cpu_var(current_dabr) != new-thread.dabr))
set_dabr(new-thread.dabr);
 
+
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   /* If new thread DAC (HW breakpoint) is the same then leave it.  */
+   if (new-thread.dac)
+   set_dac(new-thread.dac);
+#endif
+
new_thread = new-thread;
old_thread = current-thread;
 
@@ -515,6 +544,16 @@
 
discard_lazy_cpu_state();
 
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   if (current-thread.dac) {
+   current-thread.dac = 0;
+   /* Clear debug control.  */
+   current-thread.dbcr0 = ~(DBSR_DAC1R | DBSR_DAC1W);
+
+   set_dac(0);
+   }
+#endif
+
if (current-thread.dabr) {
current-thread.dabr = 0;
set_dabr(0);
Index: linux-2.6.25.1/arch/powerpc/kernel/ptrace.c
===
--- linux-2.6.25.1.orig/arch/powerpc/kernel/ptrace.c2008-05-21 
07:25:45.0 -0700
+++ linux-2.6.25.1/arch/powerpc/kernel/ptrace.c 2008-05-21 08:23:34.0 
-0700
@@ -629,9 +629,15 @@
 {
struct pt_regs *regs = task-thread.regs;
 
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   /* If DAC then do not single step, skip.  */
+   if (task-thread.dac)
+   return;
+#endif
+
if (regs != NULL) {
 #if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
-   task-thread.dbcr0 = 0;
+   task-thread.dbcr0 = ~(DBCR0_IC | DBCR0_IDM);
regs-msr = ~MSR_DE;
 #else
regs-msr = ~MSR_SE;
@@ -640,22 +646,83 @@
clear_tsk_thread_flag(task, TIF_SINGLESTEP);
 }
 
-static int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
+static int ptrace_get_debugreg(struct task_struct *task, unsigned long data)
+{
+   int ret;
+
+   #if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+   ret = put_user(task-thread.dac,
+   (unsigned long __user *)data);
+   #else
+   ret = put_user(task-thread.dabr,
+   (unsigned long __user *)data);
+   #endif
+
+   return ret;
+}
+
+int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
   unsigned long data)
 {
-   /* We only support one DABR and no IABRS at the moment */
+   /* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
+  For embedded processors we support one DAC and no IAC's
+  at the moment.  */
if (addr  0)
return -EINVAL;
 
-   /* The bottom 3 bits are flags */
if ((data  ~0x7UL) = TASK_SIZE)
return -EIO;
 
-   /* Ensure translation is on */
+#ifdef CONFIG_PPC64
+
+   /* For processors using DABR (i.e. 970), the bottom 3 bits are flags.
+  It was assumed, on previous implementations, that 3 bits were
+  passed together with the data address, fitting the design of the
+  DABR register, as follows:
+
+  bit 0: Read flag
+  bit 1: Write flag
+  bit 2: Breakpoint translation
+
+  Thus, we use them here as so.  */
+
+   /* Ensure breakpoint translation bit is set.  */
if (data  !(data  DABR_TRANSLATION))
return -EIO;
 
+   /* Move contents to the DABR register.  */
task-thread.dabr = data;
+
+#endif
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+
+   /* Read or Write bits must be set.  */
+   if (!(data  0x3UL))
+  

Re: PPC upstream kernel ignored DABR bug

2008-03-13 Thread Luis Machado
On Wed, 2008-03-12 at 23:30 +0100, Jens Osterkamp wrote:
  Just to make sure, i tested the binary against the 2.6.25-rc4 kernel. It
  still fails. So this is really an open bug for PPC.
 
 On a Cell- or 970-based machine ?
 
 Gruß,
   Jens

On a 970-based machine.

Regards,

-- 
Luis Machado
Software Engineer 
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: PPC upstream kernel ignored DABR bug

2008-03-12 Thread Luis Machado
Hi,

 On the Blade DABRX had to be set additional to DABR. PS3 and Celleb
 already did this. Uli Weigand found this back in November. I submitted
 a patch for this which went into 2.6.25-rc4.
 Can you please try again with rc4 ?

 Gruß,
 
 Jens

Just to make sure, i tested the binary against the 2.6.25-rc4 kernel. It
still fails. So this is really an open bug for PPC.

-- 
Luis Machado
Software Engineer 
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: PPC upstream kernel ignored DABR bug

2008-03-10 Thread Luis Machado
 On the Blade DABRX had to be set additional to DABR. PS3 and Celleb
 already did this. Uli Weigand found this back in November. I submitted
 a patch for this which went into 2.6.25-rc4.
 Can you please try again with rc4 ?

I will try it and will post the results back.

Thanks Jens.

Regards,
-- 
Luis Machado
Software Engineer 
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: PPC upstream kernel ignored DABR bug

2008-03-10 Thread Luis Machado
On Mon, 2008-03-10 at 12:19 -0700, Roland McGrath wrote:
  On the Blade DABRX had to be set additional to DABR. PS3 and Celleb
  already did this. Uli Weigand found this back in November. I submitted
  a patch for this which went into 2.6.25-rc4.
  Can you please try again with rc4 ?
 
 This is not the problem.  This came up before and everyone seems have
 forgotten.  This bug has been reproduced on G5's, which do not have DABRX
 as I understand it.

Yes, now that you mentioned, i've been able to reproduce this on 970FX's
blades, which i don't think have DABRX registers. I guess it's the
almost the same CPU as G5's.

Regards,

-- 
Luis Machado
Software Engineer 
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: PPC upstream kernel ignored DABR bug

2008-03-09 Thread Luis Machado
 Yes, I know. I tried it on the PS3 first and couldn't reproduce
 the bug he saw on the blade.

Arnd,

Do we have any news on this topic? 

I've seen this happening quite often within GDB when using hardware
watchpoints on a shared variable in a threaded (7+ threads) binary.
Sometimes the watchpoint won't trigger, even though the monitored
variable's value was modified.

Appreciate your feedback.

Best regards,

-- 
Luis Machado
LoP Toolchain
Software Engineer 
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev